NOTATION


$x \in A$ means that $x$ is a member of the set $A$.

$A \Rightarrow B$ means that $A$ implies $B$.

$\{x \mid x \text{ has property } a\}$ denotes the set of all $x$ such that $x$ has
property $a$.

$G: A \to B$ means that the operator (or function) $G$ maps the set $A$ into
the set $B$.

$\dot{x}$ denotes $dx/dt$.

Re z denotes the real part of the complex number z.

Im z denotes the imaginary part of z. 

$x < \infty$ means that $x$ is finite.

$x(t) \equiv a$ means that $x(t) = a$ for all $t$.

$\lim_{t \to \infty} x(t) = a$, or $x(t) \to a$ as $t \to \infty$, means that for all
$\varepsilon > 0$ there is a $T$ such that $|x(t) - a| < \varepsilon$ for all $t \ge T$.

$\sup_{x \in A} x$ denotes the supremum (or least upper bound) of the set of
numbers $A$, i.e. the least number $y$ such that $x \le y$ for all $x \in A$.

$\inf_{x \in A} x$ denotes the infimum (or greatest lower bound) of the set of
numbers $A$, i.e. the greatest number $z$ such that $z \le x$ for all
$x \in A$.

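$\operatorname{stp}_{(-a,b)} \sigma$ is the function on the real line defined by

$$\operatorname{stp}_{(-a,b)} \sigma = \begin{cases} -a, & \sigma < 0 \\ 0, & \sigma = 0 \\ b, & \sigma > 0 . \end{cases}$$

$\operatorname{sod}_{(-a,b)} \sigma$ is the function defined by

$$\operatorname{sod}_{(-a,b)} \sigma = \sigma \, \operatorname{stp}_{(-a,b)} \sigma .$$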

$\mathbb{R}$ denotes the set of all real numbers.

$\mathbb{R}^{m \times n}$ denotes the set of all $m \times n$ real matrices.

$\mathbb{R}^{n}$ denotes the set of all $n$-dimensional real vectors.

A matrix is denoted $A$; a column- or row-vector is denoted $b$.
ON THE ALGEBRAIC STRUCTURE OF BILINEAR SYSTEMS

Roger W. Brockett 

Harvard University 
Cambridge, Massachusetts 


Abstract 




Preliminary

Following a general theme in the mathematical theory
of model building, our concern here is with the relationship
between external (often empirical) descriptions of a
dynamic system and internal descriptions of the model (for
us, descriptions in terms of differential equations). We
refer to the latter as a realization of an input-output
system. The system itself is thought of as a collection
of input-output pairs.


This work was supported in part by the U.S. Office of
Naval Research under the Joint Services Electronics Program
by Contract N00014-67-A-0298-0006 and by the National
Aeronautics and Space Administration under Grant NGR 22-007-172.
We want to describe a theory which is general enough
to treat systems of the form

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) + \sum_{i=1}^{m} u_i(t) b_i \; ; \quad y(t) = c[x(t)]$$

where $A$ and the $B_i$ are square matrices, the $b_i$ are column
vectors and $c[x]$ is a finite power series. This departure
from linear systems, i.e. systems for which the $u_i(t)B_i$
terms are absent and $c$ is linear, is justified on the
grounds that a number of practical control problems can
only be modelled successfully if the multiplicative control
and output nonlinearity are present. The reasons for this



For bilinear models the tools of linear algebra are no
longer enough. There is a simple explanation of this fact.
In order to decompose the system equations as completely as
possible, it is necessary to develop canonical forms for a
set of matrices which admit both linear operations and a
type of multiplication. A form which is convenient relative
to the vector space structure of the set of matrices
typically is not well behaved relative to the multiplicative
structure, and conversely. To sort this all out requires
more than just linear algebra. For reasons having to do
with controllability, the useful multiplication rule is
[A,B] = AB-BA. The study of bilinear systems is therefore
intimately connected with the study of sets of matrices
which are closed under the vector space operations and also
under the above multiplication. These objects form Lie
algebras, and if we are to make reasonable progress in
understanding bilinear systems, this theory cannot be avoided.
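
A small numerical illustration of closure under the bracket [A,B] = AB-BA may be useful here. The sketch below is not taken from the paper: it assumes, as a hypothetical example, the 3 by 3 skew-symmetric matrices, and the helper names (`matmul`, `bracket`, `is_skew`) are illustrative only.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(X, Y):
    """Commutator [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


def is_skew(X):
    """True when the transpose of X equals -X."""
    n = len(X)
    return all(X[i][j] == -X[j][i] for i in range(n) for j in range(n))


# Assumed example: a basis for the 3x3 skew-symmetric matrices.
E1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
E2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
E3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

# Every bracket of basis elements is again skew-symmetric, so this
# vector space of matrices is also closed under [A, B] = AB - BA.
closed = all(is_skew(bracket(X, Y))
             for X in (E1, E2, E3) for Y in (E1, E2, E3))
```

The ordinary product E1 E2 is not skew-symmetric, which is one way to see why the commutator, rather than matrix multiplication, is the natural operation on such a set.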





Examples leading to bilinear constraints include 
those where energy is to be conserved. If x must satisfy 

$$x'Qx = 1$$

then we may model a controlled system by

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t)$$

where $QA + A'Q = 0$ and $QB_i + B_i'Q = 0$.
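
The role of these two conditions can be checked directly: along a trajectory, d/dt (x'Qx) = x'(QM + M'Q)x with M = A + sum u_i B_i. The following sketch evaluates that rate for a hypothetical Q, A and B chosen to satisfy the conditions; the matrices and the function name `constraint_rate` are assumptions made for illustration.

```python
def matmul(X, Y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def constraint_rate(Q, A, B_list, u, x):
    """d/dt (x'Qx) along xdot = (A + sum u_i B_i) x, i.e. x'(QM + M'Q)x."""
    n = len(x)
    M = [[A[i][j] + sum(ui * B[i][j] for ui, B in zip(u, B_list))
          for j in range(n)] for i in range(n)]
    QM, MtQ = matmul(Q, M), matmul(transpose(M), Q)
    return sum(x[i] * (QM[i][j] + MtQ[i][j]) * x[j]
               for i in range(n) for j in range(n))


# Assumed example data (not from the paper).
Q = [[2, 0], [0, 1]]
A = [[0, 1], [-2, 0]]   # satisfies QA + A'Q = 0
B = [[0, 2], [-4, 0]]   # satisfies QB + B'Q = 0

# The rate vanishes for every control value and every state tried.
rates = [constraint_rate(Q, A, [B], [u], x)
         for u in (-3, 0, 5) for x in ([1, 0], [2, -7], [4, 9])]
```

A matrix that violates the condition, for instance A = [[1, 0], [0, 0]] with the same Q, gives a nonzero rate, so x'Qx would not stay equal to 1.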

Higher order constraints can also be accommodated. Let
$V_1, V_2, \ldots, V_r$ and $W$ be vector spaces over the same field.
A map

$$\phi : V_1 \times V_2 \times \cdots \times V_r \to W$$

is called multilinear if it satisfies, for all $\alpha$ and $\beta$ in
the field and all $i = 1, 2, \ldots, r$,

$$\phi(v_1, v_2, \ldots, \alpha v_i + \beta \bar{v}_i, \ldots, v_{r-1}, v_r) = \alpha \phi(v_1, v_2, \ldots, v_i, \ldots, v_{r-1}, v_r) + \beta \phi(v_1, v_2, \ldots, \bar{v}_i, \ldots, v_{r-1}, v_r)$$

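Given a multilinear form $\phi : \mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n \to \mathbb{R}$,
suppose the constraint to be satisfied by $x$ is

$$\phi(x, x, \ldots, x) = 1$$

Let the equations of motion be

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t)$$

This imposes the following conditions on $A$ and the $B_i$:

$$\phi(Ax, x, \ldots, x) + \phi(x, Ax, \ldots, x) + \cdots + \phi(x, x, \ldots, Ax) = 0$$
$$\phi(B_i x, x, \ldots, x) + \phi(x, B_i x, \ldots, x) + \cdots + \phi(x, x, \ldots, B_i x) = 0$$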

Specific instances which require both the additive and
the multiplicative terms have been given in the literature
[1]. One large class of problems of this type arises in the
study of switched electrical networks, examples of which
appear in [2] and [3]. The bilinear form is of basic
importance in certain problems having a geometrical component
due to the Frenet-Serret formulas for curves in a
3-dimensional space.
The Basic Bilinear Model

We want to show that a large class of input-output
models can be reduced to the form

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) \; ; \quad y(t) = Cx(t) \qquad \text{(I)}$$

where $x$ is an n-tuple, $y$ is a q-tuple and $A$, $\{B_i\}$ and $C$ are
matrices of appropriate dimensions.

We begin with a simple observation. (Compare with
[2] section 7 and [3] section 4.)

Theorem 1: Any input-output map which can be realized by a
set of equations of the form

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) + \sum_{i=1}^{m} u_i(t) b_i \; ; \quad y(t) = Cx(t)$$

can be realized by a set of equations of the form

$$\dot{z}(t) = \Big(F + \sum_{i=1}^{m} u_i(t) G_i\Big) z(t) \; ; \quad y(t) = Hz(t) \qquad \text{(I')}$$

Proof: Let $F$ and $G_i$ be defined by adding a single extra row
and column to $A$ and $B_i$ respectively.

It is immediate that the z-system defines the same input-output
map as the x-system. □
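
The matrices used in the proof are not legible in this copy. One standard way to carry out the construction, offered here as an assumption rather than as the author's display, is to append a constant component to the state, z = (x, 1), and to take F = [[A, 0], [0, 0]], G_i = [[B_i, b_i], [0, 0]], H = [C, 0]. The sketch below builds these matrices and compares the two right-hand sides; all function names are illustrative.

```python
def augment(A, B_list, b_list, C):
    """Assumed construction: add one row and one column to A and each B_i."""
    n = len(A)
    F = [row[:] + [0] for row in A] + [[0] * (n + 1)]
    G_list = [[B[i][:] + [b[i]] for i in range(n)] + [[0] * (n + 1)]
              for B, b in zip(B_list, b_list)]
    H = [row[:] + [0] for row in C]
    return F, G_list, H


def rhs_inhomogeneous(A, B_list, b_list, u, x):
    """(A + sum u_i B_i) x + sum u_i b_i."""
    n = len(x)
    return [sum((A[i][j] + sum(ui * B[i][j] for ui, B in zip(u, B_list))) * x[j]
                for j in range(n))
            + sum(ui * b[i] for ui, b in zip(u, b_list))
            for i in range(n)]


def rhs_homogeneous(F, G_list, u, z):
    """(F + sum u_i G_i) z."""
    n = len(z)
    return [sum((F[i][j] + sum(ui * G[i][j] for ui, G in zip(u, G_list))) * z[j]
                for j in range(n)) for i in range(n)]


# Assumed example data.
A = [[0, 1], [-1, 0]]
B = [[1, 0], [0, 2]]
b = [3, 4]
C = [[1, 0]]
F, G_list, H = augment(A, [B], [b], C)

x, u = [7, -2], [5]
z = x + [1]                      # z(0) = (x(0), 1); the last entry stays 1
xdot = rhs_inhomogeneous(A, [B], [b], u, x)
zdot = rhs_homogeneous(F, G_list, u, z)
```

Since the last component of zdot is zero, the appended entry of z remains 1, and the first n components of z then obey the inhomogeneous equation for x.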

The second result is a little more involved. It
shows that nonlinear output maps can be reduced to linear
forms provided they are of the finite power series type.

This is the kind of result that has no counterpart in
linear theory and points out the great flexibility inherent
in the bilinear model. The basis for the result is the
observation (which goes all the way back to the thesis of
A.M. Liapunov) that if $x$ satisfies a linear equation then
$x(t)x'(t)$ satisfies one also. Thus in our case if $x$ satisfies
(I) then (prime denotes transpose)

$$\frac{d}{dt}\, x(t)x'(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t)x'(t) + x(t)x'(t) \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big)'$$

which is an equation of the bilinear type. That is, there
exist matrices $A^{[2]}$ and $B_i^{[2]}$ such that
$z = (x_1^2, x_1x_2, x_1x_3, \ldots, x_n^2)$ satisfies

$$\dot{z}(t) = \Big(A^{[2]} + \sum_{i=1}^{m} u_i(t) B_i^{[2]}\Big) z(t)$$

Of course $A^{[2]}$ and $B_i^{[2]}$ are derived from $A$ and $B_i$,
respectively. One can be more explicit using Kronecker product
relationships and the theory of symmetric tensors [4]. The
same is true not only for $\{x_ix_j\}$ but also $\{x_ix_jx_k\}$ etc., as
is easily verified. Thus associated with each bilinear
equation is a countable collection of bilinear systems, the
mth entry in this collection being the bilinear equation for
the mth-degree forms in $x$. It can be taken to be of dimension
equal to the number of linearly independent m-forms in
n variables, i.e. $n(n+1)\cdots(n+m-1)/(1 \cdot 2 \cdot 3 \cdots m)$. We indicate
the vector consisting of these forms (ordered lexicographically,
for the sake of definiteness) by $x^{[m]}$.
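
For n = 2 the matrix acting on the quadratic forms can be written out by hand, which makes the construction concrete. The sketch below is an illustration under that assumption (a 2 by 2 matrix M standing for A + sum u_i B_i at a fixed instant); the names `lift2`, `quad_forms` and `num_forms` are hypothetical.

```python
from math import comb


def num_forms(n, m):
    """Number of linearly independent m-forms in n variables."""
    return comb(n + m - 1, m)


def lift2(M):
    """For xdot = M x with M 2x2, the matrix of d/dt (x1^2, x1*x2, x2^2)."""
    (a, b), (c, d) = M
    return [[2 * a, 2 * b, 0],
            [c, a + d, b],
            [0, 2 * c, 2 * d]]


def quad_forms(x):
    return [x[0] * x[0], x[0] * x[1], x[1] * x[1]]


def quad_forms_rate(M, x):
    """Derivative of the quadratic forms obtained from the product rule."""
    dx = [M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1]]
    return [2 * x[0] * dx[0], dx[0] * x[1] + x[0] * dx[1], 2 * x[1] * dx[1]]


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


# Assumed example data.
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 5]]
u = 7
M = [[A[i][j] + u * B[i][j] for j in range(2)] for i in range(2)]
x = [5, -7]

lifted_rate = matvec(lift2(M), quad_forms(x))
direct_rate = quad_forms_rate(M, x)
```

Because `lift2` is linear in its argument, the lifted matrix of A + uB is the lifted matrix of A plus u times that of B, which is why the equation for the quadratic forms is again bilinear.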


Theorem 2: Any input-output map which can be realized in
the form

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) \; ; \quad y(t) = \sum_{p=1}^{q} L_p(x(t), x(t), \ldots, x(t))$$

where $L_p$ is a p-linear map can be realized in the form

$$\dot{z}(t) = \Big(F + \sum_{i=1}^{m} u_i(t) G_i\Big) z(t) \; ; \quad y(t) = Hz(t)$$

Proof: It is clear from the previous remarks that if $x$
satisfies a bilinear state equation then so does $x^{[m]}$. Thus
we can write an equation of the form

$$\dot{z}(t) = \Big[\tilde{A} + \sum_{i=1}^{m} u_i(t) \tilde{B}_i\Big] z(t)$$

where $z$ is defined by

$$z = \big(x, x^{[2]}, \ldots, x^{[q]}\big)$$

and $[\tilde{A} + \sum_{i=1}^{m} u_i(t)\tilde{B}_i]$ is given by

$$\begin{bmatrix} A + \sum u_i(t)B_i & 0 & \cdots & 0 \\ 0 & A^{[2]} + \sum u_i(t)B_i^{[2]} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A^{[q]} + \sum u_i(t)B_i^{[q]} \end{bmatrix}$$

Now $y$ is a linear combination of the components of $z$ since
it is multilinear in the components of $x$. □

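Example: The reader may verify that the input-output
system defined by

$$\ddot{x} = u \; ; \quad y = x^2$$

is represented by

$$\frac{d}{dt} \begin{bmatrix} 1 \\ x \\ \dot{x} \\ x^2 \\ x\dot{x} \\ \dot{x}^2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ u & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 2 & 0 \\ 0 & u & 0 & 0 & 0 & 1 \\ 0 & 0 & 2u & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ x \\ \dot{x} \\ x^2 \\ x\dot{x} \\ \dot{x}^2 \end{bmatrix} ; \quad y = [\,0 \;\; 0 \;\; 0 \;\; 1 \;\; 0 \;\; 0\,] \begin{bmatrix} 1 \\ x \\ \dot{x} \\ x^2 \\ x\dot{x} \\ \dot{x}^2 \end{bmatrix}$$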
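
The verification left to the reader can be done mechanically. The sketch below reads the example as the double integrator with squared output (second derivative of x equal to u, y = x^2) and the state vector (1, x, v, x^2, xv, v^2) with v the derivative of x; this reading, and the helper names, are assumptions made for illustration.

```python
def M_of_u(u):
    """System matrix of the six-dimensional bilinear representation."""
    return [[0, 0, 0, 0, 0, 0],
            [0, 0, 1, 0, 0, 0],
            [u, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 2, 0],
            [0, u, 0, 0, 0, 1],
            [0, 0, 2 * u, 0, 0, 0]]


def z_of(x, v):
    """State vector (1, x, v, x^2, x v, v^2), with v = dx/dt."""
    return [1, x, v, x * x, x * v, v * v]


def z_rate(x, v, u):
    """Time derivative of z_of(x, v) by the product rule, using dv/dt = u."""
    return [0, v, u, 2 * x * v, v * v + x * u, 2 * v * u]


def matvec(M, z):
    return [sum(M[i][j] * z[j] for j in range(len(z))) for i in range(len(M))]


H = [0, 0, 0, 1, 0, 0]   # picks out the x^2 component

samples = [(x, v, u) for x in (-2, 0, 3) for v in (-1, 4) for u in (-5, 2)]
agree = all(matvec(M_of_u(u), z_of(x, v)) == z_rate(x, v, u)
            for x, v, u in samples)
```

Only pointwise agreement of the two right-hand sides is checked here; equality of the trajectories then follows from uniqueness of solutions.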




System Interconnection

We say that two bilinear systems (I) and (I') are
interconnected in parallel to form a single system if both
are driven by the same input and we simply add their outputs.
That is, the equations for the parallel interconnection are

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t), \quad \dot{z}(t) = \Big(F + \sum_{i=1}^{m} u_i(t) G_i\Big) z(t) \; ; \quad y(t) = Cx(t) + Hz(t)$$

Clearly this is defined only if the dimensionality of the
input spaces of (I) and (I') is the same and the dimensionality
of the output spaces of the two systems is the same.

We say that two bilinear systems are interconnected in
series with (I') following (I) if the input to (I') is equated
to the output of (I). The equations for the series
interconnection are

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t), \quad \dot{z}(t) = \Big(F + \sum_{i} (Cx)_i(t)\, G_i\Big) z(t) \; ; \quad y(t) = Hz(t)$$

Clearly a series connection is possible if the dimension of
the output of the first system equals the dimension of the
input of the second.
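
The parallel interconnection is itself of the form (I) when the two states are stacked. A minimal sketch of that observation follows; the block-diagonal construction is elementary, but the function names and sample matrices are assumptions made for illustration.

```python
def block_diag(X, Y):
    """Block-diagonal matrix with square blocks X and Y."""
    n, m = len(X), len(Y)
    return [row[:] + [0] * m for row in X] + [[0] * n + row[:] for row in Y]


def parallel(A, B_list, C, F, G_list, H):
    """Realization of the parallel interconnection on the stacked state (x, z)."""
    A_par = block_diag(A, F)
    B_par = [block_diag(B, G) for B, G in zip(B_list, G_list)]
    C_par = [c + h for c, h in zip(C, H)]   # y = Cx + Hz
    return A_par, B_par, C_par


# Assumed example: a one-dimensional and a two-dimensional system,
# each with a single input and a single output.
A, B, C = [[1]], [[2]], [[3]]
F, G, H = [[4, 0], [0, 5]], [[0, 1], [0, 0]], [[6, 7]]
A_par, B_par, C_par = parallel(A, [B], C, F, [G], H)
```

The zip over `B_list` and `G_list` assumes both systems have the same number of inputs, and the row-wise concatenation of C and H assumes the same number of outputs, matching the conditions stated above.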

Remark: If the parallel interconnection of two input-output
systems having bilinear realizations is defined, then the
system which results from parallel interconnection has a
bilinear realization. If the series connection of a system
having a bilinear realization followed by a system having a
linear realization is defined, then the system which results
from series interconnection has a bilinear realization.

We have not been able to determine if the class of
bilinear realizations is closed under series interconnection.


The Canonical Form

The existence of the Jordan normal form for a linear
map of $\mathbb{R}^n$ into $\mathbb{R}^n$ gives rise to the "diagonal" or "partial
fraction" realization for linear systems. This is important
because in certain senses the Jordan form displays the maximum
degree of decoupling which is possible. We want to describe
the analogous situation for bilinear systems. As
might be expected, the results cannot be based on the tools
of linear algebra alone.

In view of the results of section 2, we are content to
consider henceforth systems which have realizations in the
form of equation (I).
We call two realizations

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) \; ; \quad y(t) = Cx(t) \qquad \text{(I)}$$

and

$$\dot{z}(t) = \Big(F + \sum_{i=1}^{m} u_i(t) G_i\Big) z(t) \; ; \quad y(t) = Hz(t) \qquad \text{(I')}$$

equivalent if there exists a nonsingular $P$ such that
$PAP^{-1} = F$ and $PB_iP^{-1} = G_i$ and $CP^{-1} = H$.
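
Equivalent realizations started from corresponding states (z_0 = P x_0) share every product of the form C W x_0, where W is a word in A and the B_i, since the factors P^{-1} P cancel. The sketch below checks this for a hypothetical single-input example; the matrices, the choice of P and the name `word_outputs` are illustrative assumptions.

```python
from itertools import product


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def word_outputs(A, B, C, x0, length):
    """C * W * x0 for every word W of the given length in the letters A, B."""
    results = []
    for letters in product((A, B), repeat=length):
        v = x0
        for L in reversed(letters):   # the rightmost letter acts first
            v = matvec(L, v)
        results.append(matvec(C, v))
    return results


# Assumed example realization and change of basis.
A = [[0, 1], [2, 3]]
B = [[1, 0], [1, 1]]
C = [[1, 2]]
x0 = [1, -1]
P = [[1, 1], [0, 1]]
P_inv = [[1, -1], [0, 1]]

F = matmul(matmul(P, A), P_inv)
G = matmul(matmul(P, B), P_inv)
H = matmul(C, P_inv)
z0 = matvec(P, x0)

same = all(word_outputs(A, B, C, x0, k) == word_outputs(F, G, H, z0, k)
           for k in range(4))
```

Integer matrices with an integer inverse are used so that the comparison is exact rather than subject to rounding.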

We call a realization in the form (I) irreducible if
there is no nonsingular $P$ such that

$$PAP^{-1} = \begin{bmatrix} A_{11} & 0 \\ A_{12} & A_{22} \end{bmatrix} ; \quad PB_iP^{-1} = \begin{bmatrix} B_{11} & 0 \\ B_{12} & B_{22} \end{bmatrix}$$

where $A_{11}$ and $B_{11}$ are square matrices, all of the same
dimension. That is, for no choice of basis is the realization
in block triangular form. Otherwise we call it reducible.

A reducible realization is said to be completely reducible
if it can be put in block diagonal form (as opposed to
block triangular form) with each block being irreducible.

A realization of the form (I) is said to be equivalent to a
triangular realization if there exists a nonsingular $P$
(possibly complex) such that $PAP^{-1}$ and $PB_iP^{-1}$ are lower
triangular. (Including the possibility of nonzero elements
on the diagonal.) We call it strictly triangular if there
exists $P$ such that $PAP^{-1}$ and $PB_iP^{-1}$ are strictly lower
triangular. (No nonzero elements on the diagonal.)

If a system is reducible then there are nontrivial invariant
subspaces for the collection of matrices $\{A, B_i\}$.

Let $V_1$ be one of these which is of smallest (positive)
dimension. (There may be many, pick any one.) Let $V_2$ be a
smallest invariant subspace properly containing $V_1$. Let $V_3$
be a smallest invariant subspace properly containing $V_2$, etc.
Let $n_i = \dim V_i$. Pick a basis such that the first $n_1$
elements span the space $V_1$, the first $n_2$ elements span $V_2$,
etc. Relative to this basis the matrices $A$ and $B_i$ take the
block triangular form



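$$A = \begin{bmatrix} A_{11} & 0 & 0 & \cdots \\ A_{12} & A_{22} & 0 & \cdots \\ A_{13} & A_{23} & A_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} ; \quad B_i = \begin{bmatrix} B_{11} & 0 & 0 & \cdots \\ B_{12} & B_{22} & 0 & \cdots \\ B_{13} & B_{23} & B_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}$$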


Each of the diagonal blocks is irreducible, and the
Jordan-Holder Theorem ensures that these
representations are unique in that regardless of how the
invariant subspaces are chosen, the construction will lead
to an equivalent collection of irreducible diagonal blocks.
(They may occur in a different order depending on the choice
of subspace, of course.) We collect these observations in
a theorem. (See, e.g. Samelson [4] page 12 for a sketch
of a proof.)

Theorem 3: Every bilinear realization (I) is equivalent to
one in which the $A$ and $B_i$ matrices are in block triangular
form with the diagonal blocks being irreducible. Moreover
if $(A, B_i, C)$ and $(F, G_i, H)$ are two equivalent realizations in
block triangular form with irreducible blocks on the diagonal
then there is a permutation $\pi$ and nonsingular matrices $P_k$
such that the diagonal blocks are related by

$$P_k A_{kk} P_k^{-1} = F_{\pi(k)\pi(k)} \; ; \quad P_k B_{kk} P_k^{-1} = G_{\pi(k)\pi(k)}$$

We will say that an input-output system displayed
according to the above recipe is in a reduced form.

Controllability

A detailed study of the controllability properties of
bilinear and even more general systems has been made in the
recent literature. References [5]-[8] contain many interesting
results. For our present purposes section 7 of [2]
and section 6 of [8] are relevant.

In reference [2] it is shown that if $A$ is zero, or if a
certain commutation condition is satisfied, then the reachable
set for

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) \; ; \quad y(t) = Cx(t) \; ; \quad x(0) = x_0 \qquad \text{(I)}$$

is easily computed. However Jurdjevic and Sussmann [8]
have shown that the reachable set for (I) contains an open
subset of the set reachable for

$$\dot{x}(t) = \Big(v(t)A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) \; ; \quad y(t) = Cx(t) \; ; \quad x(0) = x_0 \qquad \text{(II)}$$

From this fact it is easy to show that the reachable set
for (I) is confined to a subspace if and only if the reachable
set for (II) is confined to the same subspace. We omit
the details but make explicit use of this result below.

Theorem 4: The reachable set for (I) is confined to a subspace
if and only if there exists a nonsingular $P$ such that

$$PAP^{-1} = \begin{bmatrix} A_{11} & 0 \\ A_{12} & A_{22} \end{bmatrix} ; \quad PB_iP^{-1} = \begin{bmatrix} B_{11} & 0 \\ B_{12} & B_{22} \end{bmatrix} ; \quad Px_0 = \begin{bmatrix} 0 \\ \tilde{x}_0 \end{bmatrix}$$

where the 0 blocks are all of the same dimension.

Proof: If there exists such a $P$ then clearly the reachable
set is confined to the subspace consisting of those vectors
whose upper portion is zero.

Suppose the reachable set of (I) is confined to a subspace.
Then by our remarks above the reachable set for (II)
is confined to a subspace. But from the results of section
7 of [2] we see that this implies that $A$ and $B_i$ can be
simultaneously block triangularized. □

Remark: Notice that the set of matrices $\{A, B_i\}$ can be
simultaneously triangularized if and only if one can simultaneously
triangularize the larger set obtained from $\{A, B_i\}$ by adjoining
all linear combinations, products of any two elements,
products of products, etc. More precisely, we define $\{A, B_i\}_A$
to be the smallest vector space of matrices which contains
$\{A, B_i\}$ and is closed under multiplication by elements of
$\{A, B_i\}$. This larger set is called the associative algebra
generated by $\{A, B_i\}$. The condition of Theorem 4 can be
stated as requiring that $x_0$ should not belong to any proper
subspace of $\mathbb{R}^n$ which is invariant with respect to multiplication by
elements of the associative algebra. This statement is close
to the familiar $(B, AB, A^2B, \ldots)$ test for controllability.
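
The invariance condition in this remark can be tested by computing the smallest subspace that contains x_0 and is invariant under A and the B_i, in the same spirit as forming (B, AB, A^2B, ...). The following sketch does this with exact rational arithmetic; it is an illustration, not a procedure given in the paper, and the names `rank` and `invariant_span` are hypothetical. It assumes x_0 is nonzero.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def invariant_span(mats, x0):
    """Basis of the smallest subspace containing x0 (nonzero) that is
    invariant under every matrix in mats."""
    basis = [list(x0)]
    frontier = [list(x0)]
    while frontier:
        new_frontier = []
        for v in frontier:
            for M in mats:
                w = matvec(M, v)
                if rank(basis + [w]) > len(basis):
                    basis.append(w)
                    new_frontier.append(w)
        frontier = new_frontier
    return basis


# Assumed example: both matrices leave span{(0, 1)} invariant.
A = [[0, 0], [1, 0]]
B = [[0, 0], [0, 1]]
dim_confined = len(invariant_span([A, B], [0, 1]))
dim_full = len(invariant_span([A, B], [1, 0]))
```

With a single matrix in `mats` and x_0 replaced by b, the routine returns a basis of the span of b, Ab, A^2b, ..., which is the linear controllability test the remark alludes to.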

Theorem 5: Any input-output map which can be realized by a
bilinear system can be realized by one for which the reachable
set is not confined to a linear subspace.

Proof: Use Theorem 4. If the reachable set is confined to
a subspace find the $P$ which effects the decomposition of
Theorem 4. Delete the top block; then the input-output map
is the same but the state is not confined to a linear
subspace. □

Observability

We will say that two starting states, $x_0$ and $\bar{x}_0$, of the
system

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) \; ; \quad y(t) = Cx(t) \qquad \text{(I)}$$

are indistinguishable if for all inputs $u$, the response $y$ is
the same. This follows our approach in [2] where more general
output maps are considered. We start off with the analog of
Theorem 4.

Theorem 6: The system (I) has no indistinguishable states
if and only if there exists no nonsingular $P$ such that $CP^{-1}$,
$PAP^{-1}$, and $PB_iP^{-1}$ take the form

$$CP^{-1} = [C_1, 0] \; ; \quad PAP^{-1} = \begin{bmatrix} A_{11} & 0 \\ A_{12} & A_{22} \end{bmatrix} ; \quad PB_iP^{-1} = \begin{bmatrix} B_{11} & 0 \\ B_{12} & B_{22} \end{bmatrix}$$

Proof: Clearly if such a $P$ exists then the system is not
observable since $x_0 = (0, \tilde{x})$ implies $y = 0$.

On the other hand, if there exist two indistinguishable
states then there is a hyperplane of indistinguishable
states. Hence

$$C \, \Phi_{(A + \Sigma u_i B_i)} \, \mathcal{X} = 0 \;{}^{*}$$

for some subspace $\mathcal{X}$. Let $x$ be in $\mathcal{X}$; then $x$ belongs to the
kernel of $C$. Thus we may characterize $\mathcal{X}$ as the largest
subspace of the kernel of $C$ which is invariant under the
action of $\Phi_{(A + \Sigma u_i B_i)}$. If such a subspace exists then there
exists a choice of basis such that $(A, \{B_i\}, C)$ has the form
indicated. □

The remark following Theorem 4 is relevant here as well. 

We now give the observability version of Theorem 5. 

Theorem 7: Any input-output map which can be realized by a
bilinear system can be realized by one for which there are
no indistinguishable states.

Proof: Use Theorem 6. If there are indistinguishable states
then triangularize the system and delete the lower part of
the system. If the resulting system has indistinguishable
states repeat the operation until there are no more
indistinguishable states. □


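Example: We can apply these results to a linear system
with a linear or power law output. The n-dimensional scalar
input, scalar output system

$$\dot{x} = Ax + bu \; ; \quad y = (cx)^2 \; ; \quad x(0) = 0$$

takes the form

$$\frac{d}{dt} \begin{bmatrix} 1 \\ x \\ x^{[2]} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ ub & A & 0 \\ 0 & u\hat{b} & A^{[2]} \end{bmatrix} \begin{bmatrix} 1 \\ x \\ x^{[2]} \end{bmatrix} ; \quad y = [0, 0, C] \begin{bmatrix} 1 \\ x \\ x^{[2]} \end{bmatrix} \qquad (*)$$

where $\hat{b}$ denotes the matrix, linear in $b$, for which
$\frac{d}{dt} x^{[2]} = A^{[2]} x^{[2]} + u \hat{b} x$.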


Now if $(b, Ab, \ldots, A^{n-1}b)$ is of rank $n = \dim x$, then there is
no proper vector subspace which contains the reachable set for the
realization (*). The observability criterion can be applied
to show that this system has no distinct indistinguishable
states if $(c; cA; \ldots; cA^{n-1})$ is of rank $n$ and $cA^ib$ is nonzero
for some $i$.


*Here and above $\Phi$ with a subscript refers to a transition
matrix associated with a linear system. See [9] section 4.
Equivalent Realizations

The state space isomorphism theorems for automata and
linear systems are well known and of basic importance in
these fields. Recently theorems of this type have appeared
in other settings, for example [2] and [10]. Here we want
to describe such a result for bilinear systems.

In this section we show that any two bilinear realizations
of the same input-output map differ at most by a
change of basis provided some natural minimality conditions
are satisfied.

Let us agree to call $x_0$ an equilibrium state of the
bilinear system

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) \; ; \quad y(t) = Cx(t) \qquad \text{(I)}$$

if $Ax_0$ vanishes. This is the same as asking that $x_0$ be an
equilibrium solution of the differential equation which
results when all the $u_i$ are set to zero.

Theorem 8: Suppose that we are given two realizations of
the same input-output map

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) \; ; \quad y(t) = Cx(t) \; ; \quad x(0) = x_0$$

$$\dot{z}(t) = \Big(F + \sum_{i=1}^{m} u_i(t) G_i\Big) z(t) \; ; \quad y(t) = Hz(t) \; ; \quad z(0) = z_0$$

Let $x_0$ and $z_0$ be equilibrium states. Suppose that both
systems are observable in that any two starting states can
be distinguished for a suitable choice of $u$, and suppose that
the systems are controllable in that the reachable set from
$x_0$ or $z_0$ is not confined to any proper linear subspace.

Then the two realizations are equivalent.

Proof: Let the z-system be of dimension $n$. Without loss of
generality we can assume the x-system is of dimension less
than or equal to $n$. Let $u^1, u^2, \ldots, u^n$ be controls which are
defined over the intervals $[0, t_1], [0, t_2], \ldots, [0, t_n]$ and which
result in z-trajectories $z^1, z^2, \ldots, z^n$ (the controls being
chosen, as controllability permits, so that the endpoints
$z^i(t_i)$ are linearly independent). Let $t_*$ be the largest
of the $t$'s and define $\bar{u}^1, \bar{u}^2, \ldots, \bar{u}^n$ on $[0, t_*]$ by shifting
the $u^i$ to the latter portion of the interval and filling in
on the first portion with 0:

$$\bar{u}^i(t) = \begin{cases} 0, & 0 \le t < t_* - t_i \\ u^i(t - t_* + t_i), & t_* - t_i \le t \le t_* \end{cases}$$

Let $\bar{z}^i$ be the resulting trajectory in the z-system. As a
result of the assumption that $z_0$ is an equilibrium state

$$\bar{z}^i(t) = \begin{cases} z_0, & 0 \le t \le t_* - t_i \\ z^i(t - t_* + t_i), & t_* - t_i < t \le t_* \end{cases}$$

Let $\bar{x}^i$ be the trajectory which the x-system generates under
the control $\bar{u}^i$. Because both systems generate the same
input-output map we have

$$C \, \Phi_{(A + \Sigma u_i B_i)} \big(\bar{x}^1(t_*), \bar{x}^2(t_*), \ldots, \bar{x}^n(t_*)\big) = H \, \Phi_{(F + \Sigma u_i G_i)} \big(\bar{z}^1(t_*), \bar{z}^2(t_*), \ldots, \bar{z}^n(t_*)\big)$$

where $\Phi_{(A + \Sigma u_i B_i)}$ and $\Phi_{(F + \Sigma u_i G_i)}$ are the transition matrices
which result from an arbitrary control $u$.

Now the matrix $Z = (\bar{z}^1(t_*), \bar{z}^2(t_*), \ldots, \bar{z}^n(t_*))$ is nonsingular
by construction. If the x-system is not of the same
dimension as the z-system, or if the matrix
$X = (\bar{x}^1(t_*), \bar{x}^2(t_*), \ldots, \bar{x}^n(t_*))$ is singular, then there exists a nonzero
vector $\eta$ such that $X\eta = 0$. Then

$$C \, \Phi_{(A + \Sigma u_i B_i)} X \eta = H \, \Phi_{(F + \Sigma u_i G_i)} Z \eta = 0$$

Thus $Z\eta$ is a starting state for the z-system which is
nonzero but equivalent to 0. This violates the observability
hypothesis. Thus $X$ must be a square matrix which is
nonsingular.

Since we have for all $u$

$$C \, \Phi_{(A + \Sigma u_i B_i)} X = H \, \Phi_{(F + \Sigma u_i G_i)} Z$$

and since $I$ is certainly a possible transition matrix

$$CXZ^{-1} = H$$

Moreover, since no two states give rise to the same
input-output map, the equality

$$C \, \Phi_{(A + \Sigma u_i B_i)} = CXZ^{-1} \, \Phi_{(F + \Sigma u_i G_i)} \, ZX^{-1}$$

implies

$$\Phi_{(A + \Sigma u_i B_i)} = XZ^{-1} \, \Phi_{(F + \Sigma u_i G_i)} \, ZX^{-1}$$

From this it follows that for $P = XZ^{-1}$

$$A = PFP^{-1}$$

$$B_i = PG_iP^{-1}$$

and from above

$$H = CP$$

□

This result can also be used to establish isomorphism
theorems for realizations in inhomogeneous form. That is,
two realizations of the form

$$\dot{x}(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) x(t) + \sum_{i=1}^{m} u_i(t) b_i \; ; \quad y(t) = Cx(t)$$

can be shown to differ only by a choice of basis provided
the appropriate minimality conditions are satisfied.


We point out that in actually determining equivalent 
realizations for systems and in the classification of 
systems, the results available in the study of Lie algebras 
(e.g. [4]) are of fundamental importance. Some recent work 
relating Lie algebras and system theoretic ideas is re- 
ported in [11]. 


Conclusions 

In this paper we have shown that a particular bilinear
model is both quite general and easy to work with. Building
on previous results we have shown how to get a basic
structure theory. There are many more specific problems
which can be examined using these tools. Some of these are
under investigation and will be reported on soon.
PREFACE

The theory of differential equations and control have been 
linked very closely because most of the early applications of con- 
trol theory were to engineering problems of the type which are most 
naturally described by ordinary differential equations. The 
questions of importance in control have helped to revitalize cer- 
tain problem areas in differential equations and methods and tools 
from control have been useful in obtaining new results in differ- 
ential equation theory. On the other hand, going back to the era 
of Lie himself, there have been close ties between Lie theory and
differential equations. Thus it is not surprising that one finds 
that Lie theory and control are also closely connected. This 
"triangle" is the subject of this set of notes. 

In control theory, Lie algebras make their appearance as Lie 
algebras of vector fields. Topological properties associated with 
Lie groups show up in the study of controllability and stability. 
Partial differential operators arise in the Fokker-Planck equations 
modeling the uncertainty of the environment and our uncertainty 
about the measurements we make of it. The problems which are of 
interest in control frequently require a generalization of the 
usual treatment of topics such as existence of geodesics, express- 
ions for the spectrum of the Laplacian etc. The modification is, 
roughly speaking, to include the possibility of a metric which is 
"infinite" in certain directions, subject only to the condition 
that the directions along which it is finite can be combined in 
such a way as to make the distance between any two points finite. 
These notes contain a brief account of some of these topics, to- 
gether with references where complete proofs can be found. 

I have included a few exercises for the reader, both to indic- 
ate some results which do not exactly fit the format chosen here 
and to indicate some partial results and suggestions on additional 
problems of interest. Most of the examples are to be found in the 
exercises as well. 

It is a pleasure to thank Prof. David Mayne for organizing 
such a stimulating forum for the exchange of ideas on system theory. 


I. THE ALGEBRAIC THEORY OF LINEAR DIFFERENTIAL EQUATIONS 

1.1 Lie Algebras and Linear Differential Equations 

Clearly any linear differential equation of the form

$$\dot{x}(t) = A(t)x(t) ; \quad x(t) \in \mathbb{R}^n$$

can be expressed as

$$\dot{x}(t) = \Big(\sum_{i=1}^{m} u_i(t) A_i\Big) x(t)$$

with the $A_i$ constant matrices and the $u_i(t)$ scalar functions of
time. In view of the fact that the solution of the equation with
a single $A_i$, i.e.

$$\dot{x}(t) = u(t)Ax(t)$$

is

$$x(t) = e^{A \int_0^t u(\sigma)\, d\sigma} x(0)$$

the question arises as to when the solution of the general problem
can be written as the composition of a number of such solutions

$$x(t) = e^{A_1 g_1(t)} e^{A_2 g_2(t)} \cdots e^{A_m g_m(t)} x(0)$$

for a suitable choice of the $g_i(\cdot)$. Otherwise stated, we would
like to know if the solutions of the matrix differential equation

$$\dot{X}(t) = \Big(\sum_{i=1}^{m} u_i(t) A_i\Big) X(t) ; \quad X(0) = I \text{ (identity)}$$

can be written as

$$X(t) = e^{A_1 g_1(t)} e^{A_2 g_2(t)} \cdots e^{A_m g_m(t)}$$

for a reasonably wide class of $u_i(t)$ and over some interval of time,
say $|t| < \varepsilon$.


The above question is basically answered by a classical theorem
of Frobenius [1]. However the theorem of Frobenius which applies
here is a theorem in differential geometry. To use the insight
of his result we need to look at the problem posed from a geometrical
point of view. Consider the identity matrix as a point in the set
of all nonsingular n by n matrices. Suppose that the one parameter
curves $e^{A_i t}$ leave the identity as indicated in figure 1.



Figure 1: Neighborhood of I in the set of all n by n matrices 


We regard the set of all points of the form

$$S = \Big\{ X : X = \prod_{i=1}^{m} e^{A_i a_i} ; \; a_i \in \mathbb{R} \Big\}$$

as a subset of the set of all nonsingular n by n matrices. Our
question is, when do the integral curves of the given matrix
differential equation corresponding to a wide class of $u_i(\cdot)$ lie in S?

In order for this to be true for all piecewise continuous u's we
require, for example, that

$$e^{A_1 t} e^{A_2 t} e^{-A_1 t} e^{-A_2 t}$$

be expressible as an element of S. To see why this is so we point
out that the choice

$$u_1(\sigma) = \begin{cases} -1, & t \le \sigma < 2t \\ 0, & 0 \le \sigma < t ; \; 2t \le \sigma < 3t \\ 1, & 3t \le \sigma < 4t \end{cases}$$

$$u_2(\sigma) = \begin{cases} -1, & 0 \le \sigma < t \\ 0, & t \le \sigma < 2t ; \; 3t \le \sigma < 4t \\ 1, & 2t \le \sigma < 3t \end{cases}$$

$$u_i(\sigma) = 0, \quad i > 2$$

yields

$$X(4t) = e^{A_1 t} e^{A_2 t} e^{-A_1 t} e^{-A_2 t}$$

Geometrically, what we are asking is that in following the 4-sided
path shown in figure 2 we should not be led out of the set S.
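
For small t the four-sided path ends near I + t^2 [A_1, A_2], which is why the commutator decides whether the path can leave S. The sketch below checks this numerically with a truncated Taylor series for the matrix exponential; the sample matrices, the truncation length and the function names are assumptions made for illustration.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, X):
    return [[c * v for v in row] for row in X]


def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]


def expm(X, terms=30):
    """Matrix exponential by a truncated Taylor series (fine for small X)."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, X))
        result = add(result, term)
    return result


def four_sided_path(A1, A2, t):
    """e^{A1 t} e^{A2 t} e^{-A1 t} e^{-A2 t}."""
    left = matmul(expm(scale(t, A1)), expm(scale(t, A2)))
    right = matmul(expm(scale(-t, A1)), expm(scale(-t, A2)))
    return matmul(left, right)


# Assumed example matrices; their commutator is diag(1, -1).
A1 = [[0.0, 1.0], [0.0, 0.0]]
A2 = [[0.0, 0.0], [1.0, 0.0]]
commutator = add(matmul(A1, A2), scale(-1.0, matmul(A2, A1)))

t = 1e-2
X = four_sided_path(A1, A2, t)
approx = add(identity(2), scale(t * t, commutator))
error = max(abs(X[i][j] - approx[i][j]) for i in range(2) for j in range(2))
```

The residual `error` is of order t^3, so to second order the endpoint moves from the identity in the direction of the commutator.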



More generally if $f$ and $g$ are smooth maps of $\mathbb{R}^n$ into $\mathbb{R}^n$
and if we apply the above choice of $u(\cdot)$ to the system

$$\dot{x}(t) = u_1(t) f[x(t)] + u_2(t) g[x(t)] ; \quad x(0) = x_0$$

then a slightly messy calculation shows that to second order in t
we have

$$x(4t) = x_0 + \Big\{ \Big(\frac{\partial f}{\partial x}\Big)_{x = x_0} g(x_0) - \Big(\frac{\partial g}{\partial x}\Big)_{x = x_0} f(x_0) \Big\} t^2$$

The quantity $\frac{\partial f}{\partial x} g(x) - \frac{\partial g}{\partial x} f(x)$ is usually written as $[f,g]$ and is
called the Lie bracket of $f$ and $g$. One calls a set of vector fields
$f_i : \mathbb{R}^n \to \mathbb{R}^n$ involutive if the Lie bracket of any two is a linear
combination of the $\{f_i\}$. Frobenius showed that the set of points
near $x_0$ which can be reached from $x_0$ along integral curves of

$$\dot{x}(t) = \sum_{i=1}^{m} u_i(t) f_i(x)$$

with $\{f_i\}$ involutive can be expressed as

$$\phi_m(t_m, \phi_{m-1}(t_{m-1}, \ldots \phi_1(t_1, x_0) \ldots))$$

where $\phi_i(t, x)$ are the solutions of

$$\dot{x}(t) = f_i[x(t)]$$
The reason the set $\{f_i\}$ must be involutive is that otherwise the
special choice of $u(\cdot)$ outlined above will, for small t, surely
lead out of the set of points expressible as $\phi_m(t_m, \ldots \phi_1(t_1, x_0) \ldots)$.

Applying this type of thinking to the linear case, we see first
of all that the Lie bracket of $A_1x$ and $A_2x$ is $[A_1x, A_2x] = (A_1A_2 - A_2A_1)x$.
That is, the Lie bracket of the vector fields is expressible as the
commutator of the matrices. We write $[A_i, A_j]$ for $A_iA_j - A_jA_i$. Thus
if the set of matrices $\{A_i\}$ has the property that

$$[A_i, A_j] = \sum_{k=1}^{m} \gamma_{ijk} A_k$$

then the theorem of Frobenius would imply that for small t we can
write

$$X(t)x = \prod_{i=1}^{m} e^{A_i g_i(t)} x$$

A linear space of square matrices which is closed under $[\cdot,\cdot]$ is a
matrix Lie algebra. Of course if the original set $\{A_i\}$ does not
form a basis for a Lie algebra we simply supplement it with additional
A's until it does. If $x$ is of dimension $n$ then there are only
$n^2$ linearly independent matrices so this process always results in
a finite set.

Wei and Norman [2] have given a direct verification of the
above representation based on the implicit function theorem and have
developed a set of nonlinear differential equations for the $g_i(\cdot)$.

The basis for their derivation is the Baker-Campbell-Hausdorff
formula

$$e^{A} B e^{-A} = B + [A,B] + \frac{1}{2!}[A,[A,B]] + \frac{1}{3!}[A,[A,[A,B]]] + \cdots$$
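
The expansion can be checked exactly when A is nilpotent, since both the exponential and the series of nested brackets then terminate. The sketch below assumes a strictly upper triangular 3 by 3 matrix A (so that A^3 = 0) and an arbitrary integer B; these choices and the helper names are illustrative only.

```python
from fractions import Fraction
from math import factorial


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]


def scale(c, X):
    return [[c * v for v in row] for row in X]


def bracket(X, Y):
    return add(matmul(X, Y), scale(-1, matmul(Y, X)))


def exp_nilpotent3(A):
    """exp(A) = I + A + A^2/2, exact when A^3 = 0."""
    n = len(A)
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    return add(add(I, A), scale(Fraction(1, 2), matmul(A, A)))


def adjoint_series(A, B, terms):
    """B + [A,B] + [A,[A,B]]/2! + ... truncated after the given number of terms."""
    total = [[Fraction(0)] * len(B) for _ in B]
    term = B
    for k in range(terms):
        total = add(total, scale(Fraction(1, factorial(k)), term))
        term = bracket(A, term)
    return total


# Assumed example data.
A = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]
B = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]

lhs = matmul(matmul(exp_nilpotent3(A), B), exp_nilpotent3(scale(-1, A)))
rhs = adjoint_series(A, B, 5)   # nested brackets of depth 5 or more vanish here
```

Rational arithmetic is used so that the two sides can be compared for exact equality rather than to within rounding error.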

Thus if one assumes a solution of the form 

A g (t) A-g,(t) 

„ , _ . „ 11 l i mm 

X(t)X = 0 0 • • #0 

O 

and then differentiates, the result is 

. A g (t) A.g (t) A g (t) 

X(t) - A 1 g l (t)e 11 e 2 2 - " " 


. . .e 


+ e 


A l g l<t) 


A 2 g 2 (t)e 


A 2 g 2 (t) 


.e 


Sn 8 m <t> 


+ e 


A^Ct) A 2 g 2 (t) 


. . .A g (t)e 
m m 


A g (t) 
m m 


Now we must collect all the A's together at the left in order to
compare this expression for \dot X with that given by the differential
equation. The Baker-Campbell-Hausdorff formula provides the means
to do this. To see how this happens, observe that by inserting
e^{-A_i g_i(t)} e^{A_i g_i(t)} freely we can arrive at

    \dot g_1 A_1 + \dot g_2 e^{A_1 g_1(t)} A_2 e^{-A_1 g_1(t)} + \cdots
      + \dot g_m e^{A_1 g_1(t)} e^{A_2 g_2(t)} \cdots A_m \cdots e^{-A_2 g_2(t)} e^{-A_1 g_1(t)}
      = A_1 u_1(t) + A_2 u_2(t) + \cdots + A_m u_m(t)

We apply the Baker-Campbell-Hausdorff expansion to each term on the
left. If the set {A_i} is a basis for a Lie algebra then we can
express the result as a linear combination of the A_i. Since the A_i
are linearly independent we can equate coefficients on each side
and thereby get a set of differential equations for the g_i. It is
important to note that the differential equations for the g_i only
depend on the A_i through the commutation rules

    [A_i, A_j] = \sum_{k} \gamma_{ijk} A_k

Thus when a differential equation is solved by this method a whole 
class of differential equations are solved at the same time — one 
for each set of A's which satisfy the given commutation relation. 
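
A small worked case may make the procedure concrete. It is an illustration only and does not appear in the original: we assume m = 2 and the single commutation rule [A_1, A_2] = A_2, so that {A_1, A_2} is already a basis for a Lie algebra.

```latex
% Assumption (hypothetical example): m = 2 and [A_1, A_2] = A_2.
X(t) = e^{A_1 g_1(t)} e^{A_2 g_2(t)}
\;\Longrightarrow\;
\dot g_1 A_1 + \dot g_2\, e^{A_1 g_1} A_2 e^{-A_1 g_1} = u_1 A_1 + u_2 A_2 .

% Baker-Campbell-Hausdorff, using [A_1, A_2] = A_2 repeatedly:
e^{A_1 g_1} A_2 e^{-A_1 g_1}
  = A_2 + g_1 A_2 + \tfrac{1}{2!}\, g_1^2 A_2 + \cdots
  = e^{g_1} A_2 .

% Equating the coefficients of A_1 and A_2:
\dot g_1 = u_1 , \qquad \dot g_2 = e^{-g_1} u_2 , \qquad g_1(0) = g_2(0) = 0 .
```

Here the equations can be integrated one after the other, and they involve the matrices only through the rule [A_1, A_2] = A_2, as remarked above.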


Exercises 

1. Show that if the A_i in

    \dot X(t) = \sum_{i=1}^{m} u_i(t) A_i X(t)

are all upper triangular then it is possible to express the solution
of the differential equations for the g_i(·) explicitly in terms of
integrals.


2. Show that the smallest Lie algebra of matrices which contains
A_1 and A_2


- (S i) ; * 2 -(?°o) 


is 4 dimensional. 


3. Study the definition of Euler angles from the point of view of
the Wei-Norman equations. In particular explain why it is generally
not possible to obtain a Wei-Norman representation on the entire
half-line [0, ∞) in terms of the degeneracy of the Euler angles.

4. Show that for any square matrix P the set of all solutions of
PA + A'P = 0 forms a Lie algebra.

1.2 The x^[p] and X^(p) Equations

Associated with each linear map of R^n into R^n are two
families of linear maps which may be described as follows. Choose
a basis in R^n and let the original map be represented by the matrix
A. Then we easily see that

    y_i = \sum_{j} a_{ij} x_j
implies that the n(n+1)/2 linearly independent terms of the form
y_i y_j depend linearly on the n(n+1)/2 linearly independent terms of
the form x_i x_j. More generally the set of all linearly independent
p-degree terms y_i y_j ... y_k depends linearly on the set of all linearly
independent p-degree terms x_i x_j ... x_k. How many linearly independent
terms of degree p are there in n variables? If we denote
this integer by N_n^p then it is easy to see that

    N_{n+1}^{p+1} = N_{n+1}^{p} + N_{n}^{p+1}

from which an induction gives N_n^p = \binom{n+p-1}{p}. Thus associated
with each map of R^n into R^n is a sequence of maps, the pth one mapping
R^{N_n^p} into R^{N_n^p}.
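
The count can be checked by brute force. The following sketch is not from the original; the helper name `count_monomials` is an assumption. It enumerates the degree-p monomials in n variables as multisets of indices and compares the result with the recursion and with the binomial formula.

```python
from itertools import combinations_with_replacement
from math import comb

def count_monomials(n, p):
    """Number of distinct monomials x_i x_j ... x_k of degree p in n variables."""
    return sum(1 for _ in combinations_with_replacement(range(n), p))

for n in range(1, 6):
    for p in range(1, 6):
        # closed form N_n^p = C(n+p-1, p)
        assert count_monomials(n, p) == comb(n + p - 1, p)
        # recursion: a monomial either contains the last variable or it does not
        assert (count_monomials(n + 1, p + 1)
                == count_monomials(n + 1, p) + count_monomials(n, p + 1))

cubic_terms_in_three_variables = count_monomials(3, 3)
```

The recursion reflects a simple split: a monomial of degree p+1 in n+1 variables either contains the last variable (leaving a degree-p monomial in n+1 variables) or it does not (a degree-(p+1) monomial in n variables).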

In order to give this family of maps a matrix description we
need to choose a basis in R^{N_n^p} which is in some way convenient.
The principle which guides our choice of basis is this: let <x,y>
be the ordinary inner product

    <x,y> = \sum_{i=1}^{n} x_i y_i

If the map of R^n into R^n defined by A preserves length, we would like
the maps of R^{N_n^p} into R^{N_n^p} to preserve length as well. To achieve
this we introduce the basis elements

    \sqrt{\binom{p}{p_1}\binom{p-p_1}{p_2}\cdots\binom{p-p_1-\cdots-p_{n-1}}{p_n}}\;
        x_1^{p_1} x_2^{p_2} \cdots x_n^{p_n};
    \qquad \sum_{i=1}^{n} p_i = p; \quad p_i \ge 0

For example if n = p = 3 we have basis elements

    x_1^3, \sqrt{3}\,x_1^2 x_2, \sqrt{3}\,x_1^2 x_3, \sqrt{3}\,x_1 x_2^2, \sqrt{6}\,x_1 x_2 x_3,
    \sqrt{3}\,x_1 x_3^2, x_2^3, \sqrt{3}\,x_2^2 x_3, \sqrt{3}\,x_2 x_3^2, x_3^3

If we denote this vector, ordered lexicographically, by x^{[p]} then the
choice of basis is such that (with ||x|| = (<x,x>)^{1/2})

    ||x||^p = ||x^{[p]}||

More generally, we have

    <x,y>^p = <x^{[p]}, y^{[p]}>


We denote by A^{[p]} the map, or matrix, which verifies

    y = Ax  \Rightarrow  y^{[p]} = A^{[p]} x^{[p]}

The principal properties of A^{[p]} are covered by the following
theorem.

Theorem 1: Suppose we are given A and B, A : R^n → R^n and
B : R^n → R^n. Then A^{[p]} and B^{[p]} satisfy

    i)   I^{[p]} = I
    ii)  (AB)^{[p]} = A^{[p]} B^{[p]}
    iii) (A^q)^{[p]} = (A^{[p]})^q;  q integer; A^q defined
    iv)  (A')^{[p]} = (A^{[p]})'

Proof: i) Clear from definition. ii) Let z = Ay = ABx. Then
z^{[p]} = A^{[p]} y^{[p]} = A^{[p]} B^{[p]} x^{[p]} = (AB)^{[p]} x^{[p]}.
iii) This follows from ii) on letting B = A (or B = A^{-1} if A is
invertible) and using induction. iv) This follows from the identity
<x,y>^p = <x^{[p]}, y^{[p]}> and <x,Ay> = <A'x,y>.
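
For n = p = 2 the construction can be written out by hand and checked numerically. The sketch below is an illustration under that assumption; the names `lift2` and `lift2_matrix` are hypothetical, and the entries of the 3 by 3 matrix come from expanding y_1^2, sqrt(2) y_1 y_2 and y_2^2 with y = Ax.

```python
import math

def lift2(x):
    """x^{[2]} for n = 2: the weighted quadratic monomials."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

def lift2_matrix(A):
    """A^{[2]} for a 2 by 2 matrix A, obtained by expanding (Ax)^{[2]}."""
    (a, b), (c, d) = A
    r = math.sqrt(2)
    return [[a * a, r * a * b, b * b],
            [r * a * c, a * d + b * c, r * b * d],
            [c * c, r * c * d, d * d]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def max_abs_diff(u, v):
    return max(abs(p - q) for p, q in zip(u, v))

# Hypothetical test data.
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [-1.0, 2.0]]
x = [0.5, -1.5]

# defining property: y = Ax implies y^{[2]} = A^{[2]} x^{[2]}
lift_err = max_abs_diff(lift2(matvec(A, x)), matvec(lift2_matrix(A), lift2(x)))

# property ii): (AB)^{[2]} = A^{[2]} B^{[2]}
AB2 = lift2_matrix(matmul(A, B))
prod = matmul(lift2_matrix(A), lift2_matrix(B))
mult_err = max(max_abs_diff(r1, r2) for r1, r2 in zip(AB2, prod))

# length relation: ||x||^2 = ||x^{[2]}||
norm_err = abs(sum(v * v for v in x) - math.sqrt(sum(v * v for v in lift2(x))))
```

The factor sqrt(2) on the mixed term is exactly what makes the length relation hold.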

A second series of maps associated with A are the so called
compounds of A which we write as A^{(p)} and define in terms of
matrices as

    A^{(p)} = matrix of all p by p minors of A ordered lexicographically

Since there are \binom{n}{p} ways to select the rows and \binom{n}{p} ways to
select the columns in a p by p minor of an n by n matrix we see
that A^{(p)} is a \binom{n}{p} by \binom{n}{p} matrix. The following
properties of A^{(p)} are well known. See for example [2] or [3].

Theorem 2: Let A and B be given; A: R^n → R^n and B: R^n → R^n.
Then A^{(p)} and B^{(p)} for 0 ≤ p ≤ n map R^{\binom{n}{p}} into
R^{\binom{n}{p}} and

    i)   I^{(p)} = I
    ii)  (AB)^{(p)} = A^{(p)} B^{(p)}
    iii) (A^q)^{(p)} = (A^{(p)})^q;  q integer; A^q defined
    iv)  (A')^{(p)} = (A^{(p)})'

We have used two different points of view in defining A^{[p]} and
A^{(p)}. The construction of A^{[p]} from A was described in terms of
linear maps whereas in the definition of A^{(p)} we used matrices
exclusively. Alternative approaches are available which give
A^{(p)} a geometric meaning in terms of skew symmetric forms of degree
p in n variables.
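
The compound matrix is easy to compute directly from its definition. The sketch below is illustrative only; the function names `det` and `compound` and the sample matrices are assumptions. Integer entries are used so that the multiplicative property can be checked exactly.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 0:
        return 1
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def compound(A, p):
    """Matrix of all p by p minors of A, index sets in lexicographic order."""
    index_sets = list(combinations(range(len(A)), p))
    return [[det([[A[i][j] for j in cols] for i in rows]) for cols in index_sets]
            for rows in index_sets]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Hypothetical integer test matrices.
A = [[1, 2, 0], [3, -1, 4], [2, 5, 1]]
B = [[0, 1, 2], [1, 1, -3], [4, 0, 2]]

product_rule_holds = compound(matmul(A, B), 2) == matmul(compound(A, 2), compound(B, 2))
transpose_rule_holds = compound(transpose(A), 2) == transpose(compound(A, 2))
top_compound_is_det = compound(A, 3) == [[det(A)]]
```

Property ii) in this form is the Cauchy-Binet identity, and the case p = n reduces to det(AB) = det(A) det(B).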

These two constructions are specializations of the tensor
product in the following way. If A: R^n → R^n and B: R^n → R^n then
we may identify the tensor product of Ax and By with Ax(By)'; i.e.

    Ax ⊗ By = Ax(By)' = A(xy')B'

If we consider the linear map of the space of n by n matrices into
itself defined by L(Q) = AQB' then L_A(Q) = AQA' when restricted to act
on symmetric matrices has A^{[2]} as a matrix representation and when
restricted to the complementary space of skew symmetric matrices
it has A^{(2)} as its matrix representation. Thus if we let ≈ indicate
"similar to" then we have

    A ⊗ A = A(·)A' ≈ \begin{pmatrix} A^{[2]} & 0 \\ 0 & A^{(2)} \end{pmatrix}

One can also see that A ⊗ A ⊗ A "contains" A^{[3]} and A^{(3)} but
there are more than 2 symmetry types for a 3 index tensor so that
A^{[3]} ⊕ A^{(3)} is only part of A ⊗ A ⊗ A. (Check the dimensionality;
n(n+1)(n+2)/6 and n(n-1)(n-2)/6 do not add up to n^3.)

Now consider a linear differential equation in R^n

    \dot x(t) = A(t) x(t)

Observe that

    x^{[p]}(t+h) = (I + hA(t))^{[p]} x^{[p]}(t) + O(h^2)

so that

    x^{[p]}(t+h) - x^{[p]}(t) = [(I + hA(t))^{[p]} - I] x^{[p]}(t) + O(h^2)

Thus

    \frac{d}{dt} x^{[p]}(t) = \Big( \lim_{h \to 0} \frac{1}{h} [(I + hA(t))^{[p]} - I] \Big) x^{[p]}(t)

(Note that the dimensions of the identity matrices in these equations
are n and N_n^p respectively.) We define A_{[p]} to be the coefficient
matrix in this differential equation.

    \frac{d}{dt} x^{[p]}(t) = A_{[p]}(t) x^{[p]}(t);   p = 1, 2, 3, ...

Thus the set of all p-degree forms in (x_1, x_2, ..., x_n) satisfies
a linear differential equation with a coefficient matrix which
is easily derived from A.

Starting with a matrix equation

    \dot X(t) = A(t) X(t)

we can make an analogous construction using compound matrices
(round brackets). The estimate

    X^{(p)}(t+h) = (I + hA(t))^{(p)} X^{(p)}(t) + O(h^2)

leads to

    \frac{d}{dt} X^{(p)}(t) = \Big( \lim_{h \to 0} \frac{1}{h} [(I + hA(t))^{(p)} - I] \Big) X^{(p)}(t)

which we write as

    \frac{d}{dt} X^{(p)}(t) = A_{(p)}(t) X^{(p)}(t);   p = 1, 2, ..., n

The special case in which p = n is the basis for the well known
Abel-Jacobi-Liouville formula obtained by integrating the scalar
equation

    \frac{d}{dt} (\det X(t)) = (\mathrm{tr}\, A(t)) \det X(t)

Thus we see that A_{[p]} and A_{(p)} are infinitesimal versions of
A^{[p]} and A^{(p)} respectively. As such, they depend linearly on the
elements of A. This has some significant implications.

We also have the infinitesimal version of the tensor product
reduction given above. It takes the form

    A(·) + (·)A' = I ⊗ A + A ⊗ I ≈ \begin{pmatrix} A_{[2]} & 0 \\ 0 & A_{(2)} \end{pmatrix}

There are important relationships between A, A_{[p]} and A_{(p)}
which are more or less clear from the derivation. First of all, if A
has all distinct eigenvalues {λ_i} then the solutions of \dot x(t) = Ax(t)
consist of a sum of terms of the form a_i e^{λ_i t}. Thus x^{[p]} consists
of products, p at a time, of such terms

    x^{[p]} = \sum \beta_{ij...k} e^{(\lambda_i + \lambda_j + \cdots + \lambda_k) t}

Thus the eigenvalues of the \binom{n+p-1}{p} by \binom{n+p-1}{p} matrix
A_{[p]} are the \binom{n+p-1}{p} sums over distinct (unordered) index sets

    \lambda_i + \lambda_j + \cdots + \lambda_k;   p terms

The same is true for the case where A has eigenvalues of higher
multiplicity. Similarly, the eigenvalues of A_{(p)} consist of sums
p at a time of the eigenvalues of A but in this case the indices
i, j, ..., k must all be distinct.
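
As a check on these statements, the case n = p = 2 can be worked by hand. The computation below is an illustration added here, not part of the original; the entries a, b, c, d of A are generic symbols.

```latex
% Assumption: n = p = 2, A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},
% x^{[2]} = (x_1^2,\ \sqrt{2}\,x_1 x_2,\ x_2^2)'.
\frac{d}{dt}\,x_1^2 = 2a\,x_1^2 + \sqrt{2}\,b\,(\sqrt{2}\,x_1 x_2), \qquad
\frac{d}{dt}\,(\sqrt{2}\,x_1 x_2) = \sqrt{2}\,c\,x_1^2 + (a+d)(\sqrt{2}\,x_1 x_2) + \sqrt{2}\,b\,x_2^2, \qquad
\frac{d}{dt}\,x_2^2 = \sqrt{2}\,c\,(\sqrt{2}\,x_1 x_2) + 2d\,x_2^2 ,

A_{[2]} =
\begin{pmatrix}
 2a & \sqrt{2}\,b & 0 \\
 \sqrt{2}\,c & a+d & \sqrt{2}\,b \\
 0 & \sqrt{2}\,c & 2d
\end{pmatrix},
\qquad
A_{(2)} = a + d = \operatorname{tr} A .

% If A = \operatorname{diag}(\lambda_1, \lambda_2) then
% A_{[2]} = \operatorname{diag}(2\lambda_1,\ \lambda_1+\lambda_2,\ 2\lambda_2)
% and A_{(2)} = \lambda_1 + \lambda_2 .
```

Both matrices are visibly linear in the entries of A, and in the diagonal case the eigenvalues are the sums two at a time, with repeated indices allowed for A_{[2]} and excluded for A_{(2)}.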

A second fact involves the transition matrix Φ_A(t) which
satisfies

    \dot\Phi(t) = A(t)\Phi(t);   \Phi(0) = I

By the above construction we see that

    \Phi_{A_{[p]}}(t) = \Phi_A^{[p]}(t)

and

    \Phi_{A_{(p)}}(t) = \Phi_A^{(p)}(t)

(Again, the last of these is the Abel-Jacobi-Liouville formula if
p = n.)


Finally, if {A_i} is a basis for a Lie algebra and if

    [A_i, A_j] = \sum_{k=1}^{m} \gamma_{ijk} A_k

then

    [A_{i[p]}, A_{j[p]}] = \sum_{k=1}^{m} \gamma_{ijk} A_{k[p]}

That is, the {A_{i[p]}} form a Lie algebra with the same structural
constants. To see this we need to show that

    [A,B]_{[p]} = [A_{[p]}, B_{[p]}]

but this can be seen from the approximations

    e^{[A,B]_{[p]} t^2} ≈ (e^{[A,B] t^2})^{[p]} ≈ e^{[A_{[p]}, B_{[p]}] t^2}

where in all cases the approximations are valid up to and including
terms of second order in t. (The second one uses
e^{At} e^{Bt} e^{-At} e^{-Bt} = e^{[A,B] t^2} + O(t^3) together with the
multiplicative property of the [p] construction.) Identical formulas
hold with [p] replaced by (p).

This circle of ideas is of great importance in the theory of
representations of Lie algebras; see [4] or [5]. However in control
theory and differential equations there exist many problems
where one can use these ideas, and other ideas from representation
theory, to simplify calculations and to provide insight. A
particular example is the study of the moment equations for
stochastic differential equations. See, for example, reference [6].

Exercises



are an A, A,., pair. 

L*J r - 1 

2. Show that A^{[p]} is orthogonal if A is orthogonal. What about
A^{(p)}?

3. Describe in full the decomposition of A ⊗ A ⊗ A.

4. Give a definition of A^{[p]} for which z = Ax implies z^{[p]} = A^{[p]} x^{[p]}
but which does not require A to be square.

1.3 Matrix Lie Algebras and the Matrix Exponential 


In section 1 we saw that the solution of the differential
equation

    \dot x(t) = ( \sum_{i=1}^{m} u_i(t) A_i ) x(t);   x(0) = x_0

could be expressed for small |t| as

    x(t) = e^{A_1 g_1(t)} e^{A_2 g_2(t)} \cdots e^{A_m g_m(t)} x_0

provided the A_i form a basis for a Lie algebra. On the strength
of the theorem of Frobenius, similar statements can be made for

    \dot x(t) = \sum_{i=1}^{m} u_i(t) f_i[x(t)];   x(0) = x_0

provided the set of vectors {f_i(·)} is involutive. There is a
sort of converse question. If the set {A_i} does not form the
basis for a Lie algebra to what extent is it necessary to add
elements to these sets in order to cover all possibilities? We
know already that by adding enough elements to {A_i} so as to obtain
a basis for a Lie algebra we can be assured of a representation
of the above form. However, it might happen that for

    \dot x(t) = u_1(t) A_1 x(t) + u_2(t) A_2 x(t);   x(t) ∈ R^n

the smallest Lie algebra which contains A_1 and A_2 is of dimension
n^2. Are all of the n^2 - 2 elements which we add in order to get a
Lie algebra really necessary?

In 1939 Chow [7] published a generalization of an earlier
theorem of Caratheodory proving that if some regularity conditions
hold, then along solution curves of

    \dot x(t) = \sum_{i=1}^{m} u_i(t) f_i[x(t)];   x_0 = x(0)

one can reach the same points as one can along the solution
curves of

    \dot x(t) = \sum_{i=1}^{m} u_i(t) f_i[x(t)] + \sum_{i=1}^{\nu} v_i(t) g_i[x(t)]

where the g_i(x) are obtained as Lie brackets of the f_i, Lie brackets
of these Lie brackets, etc. Thus on the basis of this "reachability"
theorem of Chow we see that no matter how many elements
we must add to get a basis for a Lie algebra, nothing short of the
full set will suffice.

We formalize this discussion as follows. Let B denote any subspace
of gl(n). Let {B}_A denote the smallest Lie algebra which contains
B. Let C be any subset of Gl(n) and let {C}_G denote the
smallest group which contains C.

Theorem 1: With the above definitions

    {exp B}_G = {exp {B}_A}_G

Perhaps the most elementary proof of this result appears in [8].

After sufficient insight is built up it is frequently possible
to evaluate {exp {B}_A}_G by inspection. The insight comes from a
handful of special cases and general formulas such as
exp A_{[p]} = (exp A)^{[p]}. The notation for the principal special
cases is this: We take the field to be R and let

    J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}

    gl(n) = {X : X = n by n matrices}
    sl(n) = {X : X ∈ gl(n); tr X = 0}
    so(n) = {X : X ∈ gl(n); X' + X = 0}
    sp(n) = {X : X ∈ gl(n); X'J + JX = 0}

Matrices satisfying the last condition are often called Hamiltonian
because they take the form familiar in Hamiltonian mechanics

    \begin{pmatrix} A & Q \\ R & -A' \end{pmatrix};   Q = Q';   R = R'

It is very important to keep in mind that J^2 = -I so that J^{-1} = -J.

Associated with each of these algebras is a multiplicative
group of matrices which are defined in a corresponding way

    Gl(n) = {X : X is n by n matrix; det X ≠ 0}
    Sl(n) = {X : X ∈ Gl(n); det X = 1}
    So(n) = {X : X ∈ Gl(n); X'X = I}
    Sp(n) = {X : X ∈ Gl(n); X'JX = J}

These groups are called the general linear group, the special 
linear group, the special orthogonal group and the symplectic 
group, respectively. 

It is easy to verify that in any of these cases exp X belongs
to a particular group if X belongs to the corresponding algebra.
This corresponds to the following well known facts

    i)   exp M is nonsingular for all M
    ii)  det(exp M) = exp(tr M) = 1 if tr M = 0
    iii) exp A is orthogonal if A is skew symmetric since
         (e^A)' = e^{A'} = e^{-A} = (e^A)^{-1} if A = -A'.
    iv)  exp A is symplectic if A is Hamiltonian since
         e^{A'} J e^{A} = J e^{J^{-1}A'J} e^{A} = J if A'J + JA = 0.

Notice that the set of n by n symmetric matrices do not form a Lie 
algebra; alternatively, the nonsingular symmetric matrices do not 
form a group. 
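
Facts ii) to iv) are easy to confirm numerically. The sketch below is illustrative and not part of the original text: the helper `expm` is a plain scaling-and-squaring Taylor approximation written for this example, and the matrices `M`, `S`, `H` are arbitrary hypothetical test data.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(a, terms=25):
    """Matrix exponential: scale down, sum a truncated Taylor series, square back."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[v / 2 ** squarings for v in row] for row in a]
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# ii) det(exp M) = exp(tr M)
M = [[0.2, -1.0, 0.5], [0.3, 0.1, 0.7], [-0.4, 0.6, -0.8]]
det_err = abs(det3(expm(M)) - math.exp(sum(M[i][i] for i in range(3))))

# iii) the exponential of a skew symmetric matrix is orthogonal
S = [[0.0, 1.3, -0.4], [-1.3, 0.0, 2.0], [0.4, -2.0, 0.0]]
Q = expm(S)
orth_err = max_abs_diff(matmul(transpose(Q), Q), identity(3))

# iv) the exponential of a Hamiltonian matrix is symplectic (2 by 2 case)
J = [[0.0, 1.0], [-1.0, 0.0]]
H = [[0.7, 1.5], [-0.9, -0.7]]
E = expm(H)
symp_err = max_abs_diff(matmul(matmul(transpose(E), J), E), J)
```

All three residuals should be at the level of rounding error.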

The implication for the study of differential equations is as
follows. If X is an n by n matrix which satisfies the equation

    \dot X(t) = A(t) X(t)

then of course the fundamental solution Φ_A(t) is going to belong
to the general linear group. But if A at all points in time
belongs to one of the above subalgebras of gl(n) then Φ_A(t) will
belong to the corresponding subgroup of Gl(n). This group-algebra
relationship provides qualitative information about the solution
without actually solving the equations of motion.

To what extent are the above maps of the algebra into the group
actually onto the group? It is well known that a real nonsingular
matrix need not have a real logarithm. Thus as far as the real
field is concerned, exp does not map gl(n) onto Gl(n). However if
the field is either the reals or the complexes, then every matrix
sufficiently close to the identity does have a logarithm in the
appropriate field and it is easy to see that exp maps a neighborhood
of zero in the algebra onto a neighborhood of the identity in
the group in a one to one way.

Exercises 

1. Consider the set of n by n matrices whose column sums are zero.
Show that they form a Lie algebra. If we denote this algebra by
L then characterize {exp L}_G.

2. Let so(p,q) denote the set of matrices satisfying

    A' \Sigma(p,q) + \Sigma(p,q) A = 0

where Σ(p,q) is defined by

    \Sigma(p,q) = \begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix}

Show that this set of matrices forms a Lie algebra and show that
for all matrices M in exp{so(p,q)} we have

    \Sigma(p,q) = M' \Sigma(p,q) M

These are often called the pseudo orthogonal groups since they
preserve the pseudo length x'Σ(p,q)x.

1.4 Cones and Semigroups

A semigroup of real n by n matrices is simply a subset of the
n by n matrices which is closed under matrix multiplication. A
cone in a real vector space is a subset closed under addition and
multiplication by positive real numbers. Consider a real Lie
algebra L in the set of n by n matrices. Let K be a conical subset
of L. In general K will not be closed under Lie bracketing
but it could be. Let {exp K}_SG indicate the smallest semigroup
which contains exp K. As we will see, a number of problems in
control lead to the question of characterizing {exp K}_SG in terms
of K. The connection between a Lie algebra and its corresponding
Lie group suggests analogous relationships between cones in the
algebra and semigroups in the corresponding group. This kind of
relationship is illustrated in the following example.

Example: Let K be the cone in gl(n) consisting of all n by n
matrices A such that A'+A is nonnegative definite. Then {exp K}_SG
includes all orthogonal matrices since all skew symmetric matrices
belong to K. Moreover, all symmetric matrices with eigenvalues
greater than or equal to one belong to {exp K}_SG by well known
properties of the exponential map. Thus by appealing to the fact
that any matrix can be written in polar form M = ΘR with Θ orthogonal
and R positive definite we see that if for all vectors x of
unit length ||Mx||^2 = ||ΘRx||^2 = ||Rx||^2 ≥ 1 then M belongs to
{exp K}_SG. It is easy to see that if ||Mx|| < 1 for some x of
unit length then we can not express M in the required way; thus
this condition is necessary and sufficient. We conclude that the
semigroup of "expansive" matrices is the exponential of the nonnegative
definite ones. Likewise, the semigroup of (nonsingular)
"contractive" matrices is the exponential of the cone of nonpositive
definite matrices.

This example can be generalized somewhat to give a theorem 
with broader scope. 

Theorem 1: Let K be as above and let L_P be the Lie algebra of
matrices satisfying A'P + PA = 0 with P'P = I. Then

    {exp K ∩ L_P}_SG = {exp K}_SG ∩ {exp L_P}_G

i.e. the expansive matrices in {exp L_P}_G.

Proof: Given any orthogonal matrix P, the group of matrices
satisfying M'PM = P has the property that the polar representation
of each element has both its factors in the group. That is, if
M = e^Ω e^R with e^Ω orthogonal and e^R positive definite and symmetric,
then e^{Ω'} P e^Ω = P and e^R P e^R = P. To prove this we note that if
e^R e^{Ω'} P e^Ω e^R = P then e^Ω e^R = P e^{-Ω'} P' P e^{-R} P'. However
the term on the right is a polar decomposition since P e^{-R} P' is
symmetric and positive definite and P e^{-Ω'} P' is orthogonal. Thus by
uniqueness of the polar decomposition we see that e^R = P e^{-R} P'
and e^Ω = P e^Ω P', which shows that each factor belongs to the given
group.

Now if M has the polar form M = e^Ω e^R and if M belongs to
{exp K}_SG ∩ {exp L_P}_G then R ≥ 0 and Ω and R belong to L_P. Thus Ω
belongs to L_P ∩ K and so does R.

Typically the relationship between a cone in the Lie algebra
and the semigroup which the exponential maps it into is very
difficult to describe. One problem of this type which has been
investigated extensively arises in probability theory. Let x_0 ∈ R^n
have nonnegative components which sum to one. Suppose that x(t)
evolves in time according to

    \dot x(t) = A(t) x(t);   x(0) = x_0

If A(·) has the two properties:

(i) the off-diagonal elements of A(t) are nonnegative for all t
(ii) the sums of the columns of A(t) are zero for all t,

then x(t) will have nonnegative components which sum to one for all
t > 0. This is equivalent to saying that subject to the above
restrictions on A(·) the solution of the matrix equation

    \dot X(t) = A(t) X(t);   X(0) = I                         (*)

is a stochastic matrix; i.e. a matrix with nonnegative entries
whose columns sum to 1. The imbedding problem [9] is that of
determining which stochastic matrices Φ can be reached from the
identity along solutions of (*) given only that A(t) must satisfy
(i) and (ii). Of course the set of matrices which satisfy (i) and
(ii) form a cone and the set of reachable matrices form a semigroup.
It is not true however that for n > 2 this semigroup consists
of all stochastic matrices.
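
A numerical illustration may help here. It is not from the original: the generator matrices `A1`, `A2`, the switching times, and the helper `expm` (a simple scaling-and-squaring Taylor approximation) are assumptions made for this sketch. With A(t) piecewise constant, the solution of (*) is a product of exponentials; the code checks that it is stochastic, and uses the Abel-Jacobi-Liouville formula to see why a permutation matrix with determinant -1 cannot be reached.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(a, terms=25):
    """Matrix exponential: scale down, sum a truncated Taylor series, square back."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[v / 2 ** squarings for v in row] for row in a]
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Hypothetical generators: nonnegative off the diagonal, columns summing to zero.
A1 = [[-0.5, 0.2, 0.0], [0.3, -0.6, 1.0], [0.2, 0.4, -1.0]]
A2 = [[-1.0, 0.0, 0.5], [1.0, -0.3, 0.0], [0.0, 0.3, -0.5]]

# A(t) = A1 on [0, 1.2), then A(t) = A2 on [1.2, 1.9).
X = matmul(expm([[0.7 * v for v in row] for row in A2]),
           expm([[1.2 * v for v in row] for row in A1]))

min_entry = min(min(row) for row in X)
column_sums = [sum(X[i][j] for i in range(3)) for j in range(3)]

# det X(t) = exp(integral of tr A) is positive, so a stochastic matrix with
# negative determinant, such as this permutation, is not reachable.
P = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
det_X, det_P = det3(X), det3(P)
```

The determinant test is only a necessary condition; it rules out some stochastic matrices but does not characterize the reachable semigroup.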

In control applications there is particular interest in the
case of cones of the form

    K = {X : X = \alpha A + \sum_i \beta_i B_i;  \alpha \ge 0;  \beta_i unrestricted}

i.e. cones which are half spaces. The first point to make is that by
virtue of theorem 3.1 we may as well assume that the B_i form a
basis for a Lie algebra since by adding elements to {B_i} to make
the basis of the Lie algebra generated by {B_i} we do not enlarge
the reachable set. Moreover, it is also clear from theorem 3.1
that

    {exp{A, B_i}_A}_G ⊇ {exp K}_SG ⊇ {exp{B_i}_A}_G

It is more or less clear that if e^{At} is periodic then

    {exp{A, B_i}_A}_G = {exp K}_SG

and Jurdjevic and Sussmann [10] have shown that this is also true
if e^{At} is almost periodic.

It is also true that if Ad_A^k B_i belongs to the Lie algebra generated
by the B_i's then

    {exp K}_SG = e^{At} {exp{B_i}_A}_G

For a proof and some generalizations see the thesis of Hirschorn [11].

Exercises

1. Calculate {exp N}_SG where N is the cone

    N = {X : X = \begin{pmatrix} a & b \\ c & -a \end{pmatrix};  X + X' \le 0}

2. It is well known that the elements of Φ_A(t) are nonnegative
for all t ≥ 0 if A(t) itself has elements which are nonnegative off
the diagonal; the diagonals may have any sign. Give an example
which shows that {exp K}_SG is not the entire semigroup of square
matrices with nonnegative entries if K is the cone of A's described
above. (Find a matrix with positive entries and negative determinant.)

3. Explore the relationship between #2 and the imbedding problem.


II. INPUT-OUTPUT SYSTEMS 

In this chapter we consider input/output systems which can be
represented by a pair of equations of the form

    \dot X(t) = (A + \sum_{i=1}^{m} u_i(t) B_i) X(t);   y(t) = C(X(t))        (*)

Here X is an n by n matrix as are A and B_1, B_2, ..., B_m; the map
C is subject to certain restrictions to be described later. The
differential equation is said to be of the "right invariant type"
because a multiplication on the right by a fixed element M of Gl(n)
gives an equation

    \frac{d}{dt}(X(t)M) = (A + \sum_{i=1}^{m} u_i(t) B_i) X(t) M

which is again of the same form and with the same coefficient
matrices. This is to be contrasted with an equation such as

    \dot X(t) = (A + \sum_{i=1}^{m} u_i(t) B_i) X(t) + X(t) (D + \sum_{i=1}^{m} u_i(t) E_i)

which does not have this invariance property. The basic idea is
to understand as well as possible the properties of input-output
maps which can be represented by equation (*). We will study
controllability, observability and state space isomorphism
theorems.

2.1 Controllability 

If u_i is an m-dimensional piecewise continuous function of
time and if t_i is a nonnegative number, then we give the pairs
(u_i, t_i) a semigroup structure by defining

    (u_1, t_1) ∘ (u_2, t_2) = (u_1|u_2, t_1 + t_2)

where by u_1|u_2 = u_3 we mean

    u_3(t) = u_1(t),         0 ≤ t ≤ t_1
    u_3(t) = u_2(t - t_1),   t_1 < t ≤ t_1 + t_2

This is the concatenation semigroup with due regard for the domain
of definition of the functions being concatenated. We denote it
by U^m.

Consider the time invariant control system

    \dot x(t) = f[x(t), u(t)];   x(t) ∈ R^n                      (**)

with f well enough behaved so as to guarantee the existence of a
unique solution for each starting point x ∈ R^n and each
(u,t) ∈ U^m. Let T^n be the semigroup of one to one continuous maps
of R^n into R^n with composition as the semigroup operation. Then
the control system (**) defines a homomorphism of U^m into T^n. We
denote this homomorphism by φ and, by analogy with automata theory,
call the image of U^m under φ the Myhill semigroup of the system.

The main thing which is special about bilinear systems is
that the Myhill semigroup is easily identified with a matrix
semigroup. That is, if we have a system in R^n

    \dot x(t) = (A + \sum_{i=1}^{m} u_i(t) B_i) x(t)

then the matrix equation

    \dot X(t) = (A + \sum_{i=1}^{m} u_i(t) B_i) X(t);   X(0) = I

describes the relationship between U^m and T^n, each matrix being
associated with an element of T^n in the standard way

    M ↦ f(x) = Mx

If A is absent in the above equation then it is clear that
the Myhill semigroup is actually a group since if u(·) ∈ U^m steers
the system from I to M at time t_1 then v(·) ∈ U^m defined by

    v(t) = -u(t_1 - t)

steers the system to M^{-1} at t = t_1.
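
The time-reversal argument can be tried out with piecewise constant controls, for which the solution is a product of matrix exponentials. The following sketch is an illustration only: the matrices `B1`, `B2`, the control segments, and the helper names `expm` and `flow` are assumptions, and `expm` is a simple scaling-and-squaring Taylor approximation.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(a, terms=25):
    """Matrix exponential: scale down, sum a truncated Taylor series, square back."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[v / 2 ** squarings for v in row] for row in a]
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

# Hypothetical data: two coefficient matrices and a three-segment control.
B1 = [[0.0, 1.0], [0.0, 0.0]]
B2 = [[0.0, 0.0], [1.0, 0.0]]
segments = [(0.5, (1.0, -2.0)), (0.3, (-0.7, 0.4)), (1.1, (0.2, 1.5))]

def flow(control):
    """Solve dX/dt = (u1 B1 + u2 B2) X, X(0) = I, for piecewise constant (u1, u2)."""
    X = identity(2)
    for dt, (u1, u2) in control:
        gen = [[dt * (u1 * B1[i][j] + u2 * B2[i][j]) for j in range(2)]
               for i in range(2)]
        X = matmul(expm(gen), X)       # later segments multiply on the left
    return X

M = flow(segments)
# v(t) = -u(t1 - t): play the segments backwards with the sign reversed
v = [(dt, (-u1, -u2)) for dt, (u1, u2) in reversed(segments)]
M_inv = flow(v)

product = matmul(M_inv, M)
inverse_err = max(abs(product[i][j] - identity(2)[i][j])
                  for i in range(2) for j in range(2))
```

If a drift term A were present, the reversed control would not undo its contribution, which is why the group property is stated only for the case in which A is absent.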

Given an initial state x_0, the set of states reachable from
x_0 can be identified with the set of points which x_0 is mapped
into by the various elements of the Myhill semigroup. That is,
the Myhill semigroup acts on the state space

    S : R^n → R^n

The reachable set from x_0 is the "orbit" through x_0 defined by
this action.


We now give various examples of reachability theorems.

Theorem 1: There exists a control which steers the system

    \dot X(t) = ( \sum_{i=1}^{m} u_i(t) B_i ) X(t)

from X_0 to X_1 in time t_1 > 0 if and only if X_1 X_0^{-1} belongs to
{exp{B_i}_A}_G.

Proof: This is an immediate consequence of Theorem 1.3.1.

It is also easy to see that if A belongs to {B_i}_A then the
reachable set for

    \dot X(t) = (A + \sum_{i=1}^{m} u_i(t) B_i) X(t)

is just the same as it would be if A were absent.

Notice that the reachable set does not depend on t_1 as long
as t_1 is positive. If A is absent and if one restricts the controls
to be bounded, say |u_i(t)| ≤ 1, then all points of the above
form are reachable after a suitably long time but the time required
will depend on the point to be reached.


A second result which we want to use in a moment is this.

Theorem 2: The reachable set at time t for

    \dot X(t) = (A + \sum_{i=1}^{m} u_i(t) B_i) X(t);   X(0) = I




0 

0 



and 
with A square is

    R(t) = e^{At} {exp{Ad_A, B_i}_A}_G

Here {Ad_A, B_i}_A indicates the smallest Lie algebra which contains
{B_i}_A and is closed under the action of Ad_A.

Proof: See reference [8], Theorem 7.

We can combine theorems 1 and 2 in an obvious way to get the
following more general result.

Theorem 3: The reachable set at time t for

    \dot X(t) = A X(t) + \sum_{i=1}^{m} u_i(t) B_i X(t) + \sum_{i=1}^{q} v_i(t) C_i X(t)

where



A 

0 

: B = 

0 0 

• c - 

f° c il 

0 

0 

* i 

o bJ 

* i 

L0 0 J 


with A and B^ square is 

R(t) » exp At{exp{Ad~, B^ 

Finally, one can get additional results by using a nice lemma 
of Jurdjevic and Sussmann [10]. 


Theorem 4: The reachable set for the system at time t starting
from x = 0 at t = 0 and governed by

    \dot x(t) = (A + \sum_{i=1}^{m} u_i(t) B_i) x(t) + \sum_{i=1}^{p} v_i(t) g_i;   x(t) ∈ R^n

is the vector space generated by {L_j^k g_i} where k indicates powers
and {L_j} is a basis for the Lie algebra generated by {A, B_i}.

Proof: To begin we observe that if x_1 is reached at t = t_1 starting
from x = 0 at t = 0 using the control (u,v) then the control (u,αv)
steers the system to αx_1 at t = t_1. Also, we know that if we write
the system as

    \frac{d}{dt} \begin{pmatrix} x(t) \\ 1 \end{pmatrix}
      = \left( \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}
        + \sum_i u_i(t) \begin{pmatrix} B_i & 0 \\ 0 & 0 \end{pmatrix}
        + \sum_i v_i(t) \begin{pmatrix} 0 & g_i \\ 0 & 0 \end{pmatrix} \right)
        \begin{pmatrix} x(t) \\ 1 \end{pmatrix}

then the reachable set has a nonempty interior in

    R = {exp{\tilde A, \tilde B_i, \tilde G_i}_A}_G \begin{pmatrix} 0 \\ 1 \end{pmatrix}

where

    \tilde A = \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix};
    \tilde B_i = \begin{pmatrix} B_i & 0 \\ 0 & 0 \end{pmatrix};
    \tilde G_i = \begin{pmatrix} 0 & g_i \\ 0 & 0 \end{pmatrix}

There exists a nonzero control of the form (0,v) which steers the
system back to zero at time t = t_1 from 0 at t = 0; use u = 0 and
invoke standard linear theory. According to lemma 6.1 of [10]
we obtain on taking perturbations about this control an open set
in R containing 0. Using the cone property mentioned in the first
sentence we see that the reachable set is a vector space. Lie
algebras tell us which one.
A particular problem in controllability theory which has
received a good deal of attention is

    \dot x(t) = Ax(t) + u(t) b <c,x(t)>;   x(t) ∈ R^n

where u(·) is a scalar, and b is a column vector. Of course the
linear system

    \dot x(t) = Ax(t) + b v(t)

is controllable in R^n if and only if (b, Ab, ..., A^{n-1}b) is of full
rank. If the linear system is controllable it might be supposed
that the bilinear one is also controllable since if v is a control
which drives the state of the linear system from x_0 to x_1 then the
control

    u(t) = v(t) / <c,x(t)>

drives the bilinear system from x_0 to x_1. This argument has the
obvious fallacy that <c,x(t)> might vanish along the trajectory,
leaving u(t) undefined. In particular, if x(0) = 0 then of course x
vanishes identically for all future time. Thus the most one could
hope for is that any nonzero state could be steered to any nonzero
state. It turns out that this is too much to hope for also. A
simple pair of examples which illustrate that no amount of work can
salvage this argument and which at the same time suggest the nature
of the problem are these.
Consider the system

    \begin{pmatrix} \dot x_1(t) \\ \dot x_2(t) \end{pmatrix}
      = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}
      + u(t) \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}

which has the form

    \dot x(t) = Ax(t) + u(t) b <c,x(t)>

with [A,b,c] a minimal realization of s/(s^2-1). However for any
given x_0 there exists x_1 such that x_1 is not reachable from x_0
because regardless of u, the off-diagonal elements of (A + u(t)bc)
are always positive so that Φ(t,t_0), the transition matrix, has
all entries nonnegative for t ≥ t_0. Thus if x(0) has nonnegative
entries then so does x(t) for all t > 0. This argument shows that
the system is not controllable.
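
The positivity argument can be observed numerically. In the sketch below, which is only an illustration, the control is taken piecewise constant with arbitrarily chosen (hypothetical) values, including large negative ones, and `expm` is a simple scaling-and-squaring Taylor approximation written for this example. The transition matrix nevertheless has no negative entries.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(a, terms=25):
    """Matrix exponential: scale down, sum a truncated Taylor series, square back."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[v / 2 ** squarings for v in row] for row in a]
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

A = [[0.0, 1.0], [1.0, 0.0]]
BC = [[0.0, 0.0], [0.0, 1.0]]          # the rank one matrix multiplying u(t)

# Hypothetical control: (duration, value) pairs, deliberately of both signs.
control = [(0.4, 5.0), (0.5, -20.0), (0.3, 0.3), (0.6, -7.0)]

Phi = identity(2)
for dt, u in control:
    gen = [[dt * (A[i][j] + u * BC[i][j]) for j in range(2)] for i in range(2)]
    Phi = matmul(expm(gen), Phi)

min_entry = min(min(row) for row in Phi)

# A state with nonnegative entries stays in the nonnegative quadrant.
x0 = [1.0, 2.0]
x_end = [sum(Phi[i][j] * x0[j] for j in range(2)) for i in range(2)]
```

Only the diagonal of the coefficient matrix is affected by u, so no choice of control can push the state out of the nonnegative quadrant.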

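Consider the system

    \begin{pmatrix} \dot x_1(t) \\ \dot x_2(t) \end{pmatrix}
      = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}
      + k(t) \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}

which has the form \dot x(t) = Ax(t) + k(t)bcx(t) with [A,b,c] a minimal
realization of s/(s^2+1). In this case we see that the system is
controllable on R^2 - {0}. (See reference [12] for details.)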

Exercises 

1. Show that the Myhill semigroup for the linear system

    \dot x(t) = Ax(t) + bu(t);   x(t) ∈ R^n

can be identified with the multiplicative matrix semigroup

    {X : X = \begin{pmatrix} e^{At} & x \\ 0 & 1 \end{pmatrix};  t ≥ 0;  x ∈ span(b, Ab, ..., A^{n-1}b)}


2. Consider a bilinear system

    \dot x(t) = Ax(t) + u(t)Bx(t)

on R^n - {0}. Is it true that if there exists any state x such
that all points in R^n - {0} are reachable from x then all states
have this property?


3. Consider the linear system

    \dot X(t) = A_1 X(t) + X(t) A_2 + \sum_{i=1}^{m} u_i(t) B_i

Here X(t) is an n by q matrix and A_1 and A_2 are n by n and q by q
respectively; the B_i are n by q. Show that the Myhill semigroup
equation can be identified with


_d 

dt 


s x (t) 


s 3 (t) 

s 2 (t) 


-( 


0 

A 


m 

l 

i=l 


u ± (t) 


0 "ihP 1 

0 0 J / 0 


(t) 


s 3 (t) 

S 2 (t) 


Show that the reachable set at time t for the Myhill equation is 
exp At*exp{Ad A ,B i } 


2.2 Observability 


We now consider systems with an output

    \dot X(t) = (A(t) + \sum_{i=1}^{m} u_i(t) B_i(t)) X(t);   y(t) = C(X(t));   X(t) ∈ Gl(n)

The exact nature of the output map is not essential. We give the
output space no structure; it is just a set. The critical
assumption is that there should exist subgroups H_l and H_r of Gl(n)
such that C(X_1) = C(X_2) if and only if

    H_1 X_1 H_2 = X_2

for some H_1 in H_l and some H_2 in H_r. Under this assumption C(X)
identifies X to within a multiplication on the left by an element
of H_l and a multiplication on the right by an element of H_r.
We call systems of this form homogeneous.

In such a set up, the observation of y, even over a period of
time, can at most determine X to within a right multiplication by
an element of H_r. Thus we might as well regard the system as
evolving on the coset space Gl(n)/H_r. Whether or not the observation
of y and the knowledge of u over the interval [0,∞) serves
to identify uniquely an element of X/H_r as a starting state is then
subject to investigation.

Theorem 1: Consider the above system with H_r and H_l given. Let ℛ
denote the set of X's reachable from I. Suppose that ℛ is a group.
Then two points X_1 H_r and X_2 H_r in Gl(n)/H_r give rise to the same
input/output map if and only if for each R in ℛ there exists
H_1(R) in H_l such that

    R^{-1} H_1(R) R X_1 H_r = X_2 H_r

If we denote by P the subgroup

    P = {X : R^{-1} X R ∈ H_l;  ∀ R ∈ ℛ}

then any two elements of the form X_1 H_r and P_1 X_1 H_r with P_1 in P are
not distinguishable.

Proof: If X_1 H_r and X_2 H_r are to be indistinguishable as starting
states we must have

    H_l R_i X_1 H_r = H_l R_i X_2 H_r

for all R_i in ℛ. Since H_l and H_r are groups and since ℛ is a
subgroup of Gl(n), the above condition is equivalent to asking that
for each R_i in ℛ there exist H_1(R_i) in H_l such that

    R_i^{-1} H_1(R_i) R_i X_1 H_r = X_2 H_r

The remainder of the conclusions are clear.

Exercises 

1. Assuming that the evolution equations are of the form

$$\dot X(t) = AX(t) + \sum_{i=1}^{m} u_i(t) B_i X(t); \quad y(t) = H_\ell X(t) H_r$$

with

$$H_\ell = \{\exp\{C_i\}_A\}_G ; \quad H_r = \{\exp\{D_i\}_A\}_G$$

give an observability condition in terms of Lie algebras. (See
ref. [8] for some results along this line.)

2. Apply the results of problem 1 to the bilinear problem

$$\dot x(t) = Ax(t) + \sum_{i=1}^{m} u_i(t) B_i x(t); \quad y(t) = c[x(t)]$$

by identifying $\mathbb{R}^n$ with the n dimensional affine group modulo
$G\ell(n)$.


2.3 Isomorphic Systems 

The two scalar realizations

$$\dot x(t) = x(t) + u(t)x(t); \quad y(t) = x^3(t); \quad x(0) = 1$$

and

$$\dot z(t) = 3z(t) + 3u(t)z(t); \quad y(t) = z(t); \quad z(0) = 1$$

realize the same input-output map. They are each controllable on
$(0,\infty)$ and any two reachable states are distinguishable. They are
related by the automorphism of the multiplicative group $(0,\infty)$
defined by

$$x^3 = z$$

Thus despite the apparent differences between these two realizations
they are closely related. The following theorem describes a general
result of this type.
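The claim that the two scalar realizations share one input-output map can be checked numerically. The sketch below is only an illustration: the test input `u`, the step count and the helper names are assumptions introduced here, and a fixed-step Runge-Kutta integrator stands in for an exact solution.

```python
import math

def rk4_step(f, t, s, h):
    """One classical Runge-Kutta step for the scalar ODE s' = f(t, s)."""
    k1 = f(t, s)
    k2 = f(t + h / 2, s + h * k1 / 2)
    k3 = f(t + h / 2, s + h * k2 / 2)
    k4 = f(t + h, s + h * k3)
    return s + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def simulate(rhs, u, t_end=1.0, steps=2000):
    """Integrate s' = rhs(s, u(t)) from s(0) = 1 up to t_end."""
    h = t_end / steps
    s, t = 1.0, 0.0
    for _ in range(steps):
        s = rk4_step(lambda tt, ss: rhs(ss, u(tt)), t, s, h)
        t += h
    return s

def u(t):
    # Arbitrary test input (an assumption; any bounded input would do).
    return math.sin(3.0 * t) - 0.5

x_end = simulate(lambda x, uu: x + uu * x, u)          # first realization
z_end = simulate(lambda z, uu: 3 * z + 3 * uu * z, u)  # second realization

y_first = x_end ** 3   # output y = x^3
y_second = z_end       # output y = z
```

Up to integration error the two outputs agree, which reflects the change of variable $z = x^3$.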

Theorem 1: Consider the two homogeneous realizations of the same
input-output map

$$\dot X(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) X(t); \quad y(t) = c[X(t)]$$

$$\dot Z(t) = \Big(F + \sum_{i=1}^{m} u_i(t) G_i\Big) Z(t); \quad y(t) = h[Z(t)]$$

which evolve in $G\ell(n_1)$ and $G\ell(n_2)$ respectively and which have
reachable sets from the identity, $\mathcal{R}$ and $\hat{\mathcal{R}}$, which are groups.
Suppose $H_\ell$, $H_r$ and $\hat H_\ell$, $\hat H_r$ are given subgroups of $G\ell(n_1)$ and $G\ell(n_2)$
respectively such that c and h are one to one on $H_\ell \mathcal{R} H_r$ and $\hat H_\ell \hat{\mathcal{R}} \hat H_r$
and such that the systems are observable on $\mathcal{R}H_r$ and $\hat{\mathcal{R}}\hat H_r$. Finally,
suppose that there is no normal subgroup of $\mathcal{R}$ which has a nontrivial
intersection with $\mathcal{R} \cap H_r$ and the same for $\hat{\mathcal{R}}$ and $\hat H_r$. Then
there exists an isomorphism $\phi : \mathcal{R} \to \hat{\mathcal{R}}$ such that

$$\phi(e^{At}) = e^{Ft}; \quad \phi(e^{B_i t}) = e^{G_i t}$$


Proof: Suppose that there exists a control (u,T) in $U^m$ which takes
the first system from I to $D_1 I$ and takes the second system from
I to I. Let D denote the set of all such points. By virtue of
the observability hypothesis we see that D is a subset of $H_r$ and,
in fact, a subgroup of $H_r$. Moreover it is easily seen to be a
normal subgroup of $\mathcal{R}$ and hence of $\mathcal{R} \cap H_r$. By hypothesis D is
trivial. This implies that there is a one to one correspondence
between points in $\mathcal{R} \cap H_r$ and $\hat{\mathcal{R}} \cap \hat H_r$ which is, in fact, a
homomorphism.

We see that $\mathcal{R}$ and $\hat{\mathcal{R}}$ are both homomorphic images of $U^m$. If a
pre-image of $\mathcal{R}$ in $U^m$ is $U_R$ then what is the image under the
action of the second system of $U_R$? It is clearly $\hat{\mathcal{R}}$ or else a
subgroup of $\hat{\mathcal{R}}$. If it is a subgroup then the subgroup must contain
$\hat{\mathcal{R}} \cap \hat H_r$ but there is a one to one and onto correspondence between
$\mathcal{R}/\mathcal{R} \cap H_r$ and $\hat{\mathcal{R}}/\hat{\mathcal{R}} \cap \hat H_r$ and an isomorphism between $\mathcal{R} \cap H_r$ and $\hat{\mathcal{R}} \cap \hat H_r$.
Using the properties of the system maps we see that the above map
must be onto $\hat{\mathcal{R}}$ and thus it establishes an isomorphism. The
remaining claims then follow.


Exercises 

1. Develop the Lie algebra analog of Theorem 1. 

2. Apply the above results to bilinear systems of the form

$$\dot x(t) = Ax(t) + \sum_{i=1}^{m} u_i(t) B_i x(t); \quad y(t) = cx(t); \quad x(0) = x_0$$

See P. d'Alessandro, A. Isidori and A. Ruberti [13] and Brockett
[14].
III. OPTIMAL CONTROL


This chapter is quite brief due to the absence in the liter- 
ature of results relating specifically to the Lie group case. We 
discuss only two problem areas — the question of existence of 
optimal controls in the bang bang case and questions centering 
around minimum "energy" transfer. 

3.1 Bang-Bang Theorems 


It is well known that under very weak assumptions on the
matrices A(·) and B(·) the linear system

$$\dot x(t) = A(t)x(t) + B(t)u(t); \quad x(0) = \text{given}$$

with controls constrained by

$$|u_i(t)| = 1$$

has a set of reachable points at any time $t_1 > 0$ which is the
same as the set of points reachable with the constraint relaxed to

$$|u_i(t)| \le 1$$

This is called a "bang-bang theorem" because the controls $u_i$ need
only take on their extreme values and not intermediate ones. Some
generalizations of this have been investigated by Krener [15] and
Sussmann [16]. We examine only an easy case here.


Theorem 1: Let X satisfy the differential equation in $G\ell(n)$

$$\dot X(t) = AX(t) + \Big(\sum_{i=1}^{m} B_i u_i(t)\Big) X(t)$$

Then if $[\mathrm{Ad}_A^k(B_i), B_j]$ is zero for all i and j and $k = 0,1,\ldots,n^2-1$ then
the set of states reachable at time t for $|u_i(t)| = 1$ is the same
as the set reachable for $|u_i(t)| \le 1$.

Proof: In view of the commutativity condition we can express the
solution of the given equation as

$$X(t) = e^{At} \exp\Big( \int_0^t \sum_{i=1}^{m} e^{-A\sigma} B_i e^{A\sigma} u_i(\sigma)\, d\sigma \Big) X(0)$$

See [8] Theorem 7 for details. Now since the bang-bang theorem
is valid for the linear system

$$\dot F(t) = \sum_{i=1}^{m} e^{-At} B_i e^{At} u_i(t)$$

and since $X(t) = e^{At} e^{F(t)} X(0)$ we see that it holds for the systems
defined here as well.


Exercises 

1. The solution of the scalar differential equation

$$\dot x(t) = u(t)x(t) + v(t)$$

is

$$x(t) = e^{\int_0^t u(\sigma)\,d\sigma}\, x(0) + \int_0^t e^{\int_\sigma^t u(\rho)\,d\rho}\, v(\sigma)\, d\sigma$$

Is the bang-bang theorem valid if we regard u and v as controls?

2. Is the bang-bang theorem valid for the pair of scalar
equations

$$\dot z(t) = u(t)z(t)$$
$$\dot x(t) = (u(t)+v(t))x(t)$$

3. Show that the bang-bang theorem is valid for

$$\dot x(t) = u(t)x(t)$$
$$\dot y(t) = -y(t) + u(t)$$

Generalize this result.


3.2 Least Squares Theory 


Under the assumption used in the previous section we can
develop a satisfactory theory for minimizing

$$\eta = \int_0^{t_1} \sum_{i=1}^{m} u_i^2(t)\, dt$$

subject to the constraint that the system

$$\dot X(t) = \Big(A + \sum_{i=1}^{m} u_i(t) B_i\Big) X(t) \qquad (*)$$

should be transferred from the state $X_0$ at t=0 to the state $X_1$
at $t=t_1$.

Theorem 1: Let X(t) satisfy the $G\ell(n)$ equation (*). Suppose
that $[\mathrm{Ad}_A^k B_i, B_j] = 0$ for all i and j and $k=0,1,2,\ldots,n^2-1$. Suppose
that

$$X_1 X_0^{-1} \in e^{At_1}\{\exp\{\mathrm{Ad}_A^k B_i\}_A\}_G$$

Then there exists a control u(·) which steers the system from $X_0$
at t=0 to $X_1$ at $t=t_1$ and minimizes $\eta$. This control is the same
as the control which steers the linear system

$$\dot F(t) = \sum_{i=1}^{m} e^{-At} B_i e^{At} u_i(t)$$

from 0 at t=0 to $\ln(e^{-At_1} X_1 X_0^{-1})$ at $t=t_1$ and minimizes $\eta$, where ln
denotes the real solution of

$$e^{M} = e^{-At_1} X_1 X_0^{-1}$$

which results in the smallest value of $\eta$. The optimal control is
of the form

$$u_i(t) = \mathrm{tr}(M_i e^{-At} B_i e^{At})$$

for some constant matrices $M_i$.

Proof: As in the proof of the bang-bang theorem we see that

$$X(t) = e^{At} e^{F(t)} X_0$$

where F(t) satisfies

$$\dot F(t) = \sum_{i=1}^{m} e^{-At} B_i e^{At} u_i(t)$$

From this point on everything follows from standard linear theory.
See [17], section 22.

Exercises 

1. Consider the system

$$\dot x(t) = x(t) + u(t)$$
$$\dot y(t) = u(t)y(t)$$

Suppose we want to steer this system from $(\alpha,\beta)$ to $(\gamma,\delta)$ in $t_1$
units of time and to minimize

$$\eta = \int_0^{t_1} u^2(t)\, dt$$

If $\delta/\beta$ is positive this transfer is possible and the u(·) which
achieves the optimum is of the form $ae^{-t}+b$. Generalize Theorem 1
in such a way as to capture this example.

2. If $B_1$ and $B_2$ commute, describe the solutions of

* , Vi Vi. „ 

II (e e ) = N 

i-1 


IV. STOCHASTIC DIFFERENTIAL EQUATIONS 

Stochastic processes on spheres have been of interest in
physics for some time. Debye [18] in his book on statistical 
mechanics gives one application of S^ stochastic processes. 

Nuclear magnetic resonance phenomena account for some more recent
interest in diffusions on $S^2$. See Chapter 15 of the recent text
[19]. The French mathematical physicist Perrin wrote a classical
paper [20] on diffusion on SO(3). Recent interest in physics regarding
models of the type under study here is discussed in Fox
[21]. Transmission of electromagnetic waves through random media
leads to stochastic processes on the symplectic group — distance 
playing the role usually assumed by time. Tutubalin [22] can be 
consulted for recent results and references. Carrier [23] has 
examined an equation of this general type in connection with a 
gravity wave propagation problem. One can think of this study as 
a stochastic process on the two dimensional symplectic group. An 
engineering problem for which the theory is potentially interesting 
is the randomly switched electrical circuit. 
4.1 Bilinear Stochastic Equations


In this paper all stochastic differential equations are to be 
interpreted in the Ito sense. All Wiener processes are of unity 
variance and Wiener processes with distinct indices are assumed to 
be uncorrelated. The reader is encouraged to study Clark [24] for 
more details on stochastic calculus. 


Under what circumstances does the Ito equation

$$dx(t) = Ax(t)\,dt + \sum_{i=1}^{m} dw_i(t) B_i x(t) \qquad (*)$$

evolve on the manifold defined by x'Qx = constant? If we expand
to second order keeping in mind that $dw_i\,dw_j = \delta_{ij}\,dt$ we get

$$d(x'Qx) = x'(A'Q+QA)x\,dt + \sum_{i=1}^{m} x'(B_i'Q + QB_i)x\,dw_i + \sum_{i=1}^{m} x'B_i'QB_i x\,dt$$

Thus in order for the differential of x'Qx to vanish we require

$$A'Q + QA + \sum_{i=1}^{m} B_i'QB_i = 0$$

and also we require

$$B_i'Q + QB_i = 0$$

We see that the drift term A needs to be "corrected" by a term
coming from the white noise. For example, if we want equation (*)
to evolve on a sphere then A is not skew symmetric as it would be
in the deterministic case but rather it has a correction term
whose size depends on the $B_i$. On the other hand, the $B_i$ must be
skew symmetric.
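The sphere case (Q = I) can be checked with a few lines of matrix arithmetic. This is a minimal sketch under stated assumptions: the particular skew matrices `B1`, `B2`, `S` are hypothetical test data, and the Ito correction is taken to enter as $\sum_i B_i'QB_i$, which is what Ito's rule gives for unit-variance Wiener processes.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def add(*mats):
    """Entrywise sum of any number of equally sized matrices."""
    n = len(mats[0])
    return [[sum(M[i][j] for M in mats) for j in range(n)] for i in range(n)]

def scale(c, X):
    return [[c * v for v in row] for row in X]

def max_abs(X):
    return max(abs(v) for row in X for v in row)

# Hypothetical skew-symmetric noise matrices and skew drift part.
B1 = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
B2 = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, -2.0, 0.0]]
S = [[0.0, 0.3, -0.7], [-0.3, 0.0, 0.1], [0.7, -0.1, 0.0]]

# Corrected drift: skew part plus half the sum of the squares of the B_i.
A = add(S, scale(0.5, add(matmul(B1, B1), matmul(B2, B2))))

def drift_residual(A_mat):
    """Max entry of A' + A + sum_i B_i' B_i (the dt-coefficient with Q = I)."""
    quad = add(matmul(transpose(B1), B1), matmul(transpose(B2), B2))
    return max_abs(add(transpose(A_mat), A_mat, quad))

residual_corrected = drift_residual(A)  # vanishes: |x|^2 is conserved
residual_skew_only = drift_residual(S)  # nonzero: a skew drift alone fails
```

With the corrected drift the residual is zero to rounding error, while a purely skew drift leaves a nonzero dt term, so the norm would grow.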


In order to evolve on the symplectic group it is a skew
symmetric form which must be preserved. Repeating the above with
Hamiltonian matrices gives rise to the conditions that the $B_i$ and
$A - \frac{1}{2}\sum_i B_i^2$ should be Hamiltonian.

Exercises 

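1. Show that the Ito equation

$$\begin{bmatrix} dx_1 & dx_2 \\ dx_3 & dx_4 \end{bmatrix}
= \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}
\begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix} dt
+ \begin{bmatrix} dw_1 & dw_2 \\ dw_3 & -dw_1 \end{bmatrix}
\begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix}$$

evolves on the special linear group $S\ell(2)$ if suitable restrictions
are placed on $\alpha$, $\beta$, $\gamma$, $\delta$.

2. Generalize the previous problem to $S\ell(n)$.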


4.2 The Moment Equations 


Associated with the stochastic equation

$$dx(t) = Ax(t)\,dt + \sum_{i=1}^{m} B_i x(t)\,dw_i(t) \qquad (*)$$

is a family of higher order equations analogous to those given in
section 1.2. These are the equations for $x^{[p]}$. In order to display
their form it is necessary to work out section 1.2 using the
Ito calculus. As an alternative, suggested to me by Martin Clark,
one can convert (*) into an analogous Stratonovich equation, use
the ordinary calculus to get the $x^{[p]}$ equation, and then convert
back to the Ito form. This idea is particularly attractive in the
present setup since we have the deterministic results already.

The Stratonovich analog of (*) is simply

$$đx(t) = \Big(A - \frac{1}{2}\sum_{i=1}^{m} B_i^2\Big)x(t)\,dt + \sum_{i=1}^{m} B_i x(t)\,đw_i(t)$$

where đ indicates Stratonovich differentials. Applying ordinary
calculus we get

$$đx^{[p]}(t) = \Big(A - \frac{1}{2}\sum_{i=1}^{m} B_i^2\Big)_{[p]} x^{[p]}(t)\,dt + \sum_{i=1}^{m} B_{i[p]} x^{[p]}(t)\,đw_i(t)$$

Now if we want to convert this back to an Ito form we must correct
the drift term to get

$$dx^{[p]}(t) = \Big[\Big(A - \frac{1}{2}\sum_{i=1}^{m} B_i^2\Big)_{[p]} + \frac{1}{2}\sum_{i=1}^{m} (B_{i[p]})^2\Big] x^{[p]}(t)\,dt + \sum_{i=1}^{m} B_{i[p]} x^{[p]}(t)\,dw_i(t)$$

We can easily take expectations to get the moment equation

$$\frac{d}{dt}\mathcal{E}x^{[p]}(t) = \Big[\Big(A - \frac{1}{2}\sum_{i=1}^{m} B_i^2\Big)_{[p]} + \frac{1}{2}\sum_{i=1}^{m} (B_{i[p]})^2\Big] \mathcal{E}x^{[p]}(t)$$

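Notice that the apparently more general equation

$$dx(t) = Ax(t)\,dt + \sum_{i=1}^{m} B_i x(t)\,dw_i(t) + \sum_{i=1}^{m} c_i\,dw_i(t) \qquad (**)$$

is covered by these equations as well. To see this we let

$$\tilde x = \begin{bmatrix} x \\ 1 \end{bmatrix}$$

then $\tilde x$ satisfies an equation of the form

$$d\tilde x(t) = \tilde A \tilde x(t)\,dt + \sum_{i=1}^{m} (\tilde B_i + \tilde C_i)\tilde x(t)\,dw_i(t)$$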

There are many papers in the literature which analyze the stability
of these equations under various assumptions, with particular emphasis
being placed on the case p=2. See, e.g. [25]. In reference [6] it
is shown that under a suitable hypothesis all the moment equations
are stable.

Exercises 

1. Show that in the scalar case the moment equations for

$$dx(t) = \alpha(t)x(t)\,dt + \beta(t)x(t)\,dw(t)$$

are

$$\frac{d}{dt}\mathcal{E}x^p(t) = \Big[p\Big(\alpha(t) - \frac{1}{2}\beta^2(t)\Big) + \frac{1}{2}p^2\beta^2(t)\Big]\mathcal{E}x^p(t)$$

Notice that if $\alpha$ and $\beta \ne 0$ are constant then it can never happen that
all moment equations are stable.
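The closing remark of this exercise is easy to see numerically: for constant coefficients the growth rate of the p-th moment is quadratic in p with positive leading coefficient, so it eventually turns positive. The sketch below only evaluates that scalar formula; the function names and the sample values of $\alpha$ and $\beta$ are assumptions chosen for illustration.

```python
def moment_rate(p, alpha, beta):
    """Growth rate of E[x^p] for dx = alpha*x dt + beta*x dw (Ito sense)."""
    return p * (alpha - 0.5 * beta ** 2) + 0.5 * p ** 2 * beta ** 2

def first_unstable_order(alpha, beta, p_max=10000):
    """Smallest p >= 1 whose moment equation has a positive growth rate."""
    for p in range(1, p_max + 1):
        if moment_rate(p, alpha, beta) > 0:
            return p
    return None

# Sample values (assumed): a strongly stable drift with modest noise.
alpha, beta = -1.0, 0.5
rates = [moment_rate(p, alpha, beta) for p in range(1, 13)]
p_star = first_unstable_order(alpha, beta)
```

Here the mean decays at rate $\alpha$, yet the moments of order `p_star` and above grow, which is the phenomenon the exercise points to.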

2. A problem of interest in geophysics leads to the stochastic 
equation 
dx^(t) 0 dt dx^(t) x^(0) 1'

{^(t) -dt+edw(t) 0 . _dx 2 (t)_ ’ X 2 ( 0 )_ 0 


Show that the autocorrelation iSj for small approximated by 
^x 1 (t)x 1 (T) = e (£ 2 /^)(t+T) e -(e /4 >^ t-T > cos ( t _- > ) 

(See Carrier [23]). 


4.3 Fokker-Planck Equations


Associated with the Ito equation

$$dx(t) = Ax(t)\,dt + \sum_{i=1}^{m} dw_i(t) B_i x(t)$$

is the formal Fokker-Planck equation


fe * I tr( 1 l 1 B i xx ’ B i’ 5^ ilj ■ 0 


However, if x evolves on a manifold then this equation will not
be especially useful unless the redundant variables are eliminated.

In order to carry out this reduction it is necessary to coordinatize
the manifold in some natural way. This coordinatization necessarily
proceeds in a case by case way. To illustrate we work out four
cases on the two-sphere $S^2$.


Consider the stochastic equations (Compare with McKean [26]
who considers case b, case a being classical.)
We introduce polar coordinates according to figure 3.

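The Fokker-Planck equations corresponding to the above cases are
then

$$\Big[\frac{\partial}{\partial t} - \frac{1}{2}\Big(\frac{1}{\sin\phi}\frac{\partial}{\partial\phi}\sin\phi\frac{\partial}{\partial\phi} + \frac{1}{\sin^2\phi}\frac{\partial^2}{\partial\theta^2}\Big)\Big] p(t,\phi,\theta) = 0 \qquad (a)$$

$$\Big[\frac{\partial}{\partial t} - \frac{1}{2}\Big(\frac{1}{\sin\phi}\frac{\partial}{\partial\phi}\sin\phi\frac{\partial}{\partial\phi} + \frac{1}{\tan^2\phi}\frac{\partial^2}{\partial\theta^2}\Big)\Big] p(t,\phi,\theta) = 0 \qquad (b)$$

$$\Big[\frac{\partial}{\partial t} - \frac{1}{2}\Big(\frac{1}{\sin\phi}\frac{\partial}{\partial\phi}\sin\phi\frac{\partial}{\partial\phi} + \frac{1}{\tan^2\phi}\frac{\partial^2}{\partial\theta^2}\Big) + \frac{\partial}{\partial\theta}\Big] p(t,\phi,\theta) = 0 \qquad (c)$$

$$\Big[\frac{\partial}{\partial t} - \frac{1}{2}\Big(\sin\theta\frac{\partial}{\partial\phi} + \cot\phi\cos\theta\frac{\partial}{\partial\theta}\Big)^2 + \frac{\partial}{\partial\theta}\Big] p(t,\phi,\theta) = 0 \qquad (d)$$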


The idea behind the derivation of these equations is that
each of the three generators

$$\begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}$$

can be associated with a first order partial differential operator
which describes the effect of a drift around the corresponding axis
of rotation and also with a second order partial differential
operator which describes the effect of a diffusion around the
corresponding axis of rotation. The derivation of these operators
is an exercise in differential geometry; however, the following
insight is useful.

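On a manifold with a Riemannian metric $(g_{ij}(x))$, the Laplace-Beltrami
operator [27]

$$\nabla^2 = \frac{1}{\sqrt{\det(g_{ij}(x))}} \sum_{i,j} \frac{\partial}{\partial x_i} \sqrt{\det(g_{ij}(x))}\; g^{ij}(x) \frac{\partial}{\partial x_j}$$

serves as the Laplacian, in that the basic heat equation, assuming
constant conductivity, is

$$\Big(\frac{\partial}{\partial t} - \frac{1}{2}\nabla^2\Big)\phi(t,x) = 0$$

On $S^2$, in terms of the given coordinates, the usual metric is

$$(ds)^2 = [d\phi,\ d\theta] \begin{bmatrix} 1 & 0 \\ 0 & \sin^2\phi \end{bmatrix} \begin{bmatrix} d\phi \\ d\theta \end{bmatrix}$$

one sees easily that case a above corresponds to the heat equation.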

As for case b, it is obtained from case a by removing one of
the generators: the one which corresponds to a diffusion about
the $x_3$-axis. This is equivalent to subtracting $\frac{1}{2}(\partial^2/\partial\theta^2)$ from
the operator appearing in case a.

Case c is obtained in an analogous way. We must add a drift
term to the operator appearing in b corresponding to a rotation
about the $x_3$-axis. Thus we add a $(\partial/\partial\theta)$ term to the operator
appearing in b.

Case d is the most degenerate of all in that there is now only
diffusion about one axis. There is a $(\partial/\partial\theta)$ drift term as in
case c together with the operator which corresponds to diffusion
about the $x_1$-axis.

It is of some interest to note that all these operators are
studied in quantum theory. See Rose [28], appendix A.


Exercises 

1. Consider the stochastic equation

$$\begin{bmatrix} dx_1 \\ dx_2 \\ dx_3 \end{bmatrix}
= \begin{bmatrix} -\frac{1}{2}dt & -dw & 0 \\ dw & -\frac{1}{2}dt & dt \\ 0 & dt & 0 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}; \quad x_1^2(0)+x_2^2(0)-x_3^2(0) = 1$$

Show that it evolves on the manifold defined by $x_1^2+x_2^2-x_3^2=1$. Introduce
coordinates in this manifold and work out the Fokker-Planck
equation. Is there a limiting distribution?


2. Show that the moment equations associated with each of the
four cases analyzed here are stable (see [26]).


4.4 Calculation of Diffusion Times 


We continue with the analysis of the four cases of diffusions
on spheres, now with a view toward determining, if possible, a
complete solution to the Fokker-Planck equation. In cases where
that proves too difficult we look for some measure of the relaxation
time of the process.

To begin with, the standard $S^2$ diffusion (case a above) leads
to the Fokker-Planck equation

$$\Big(\frac{\partial}{\partial t} - \frac{1}{2}\nabla^2\Big) p(t,x) = 0$$

where $\nabla^2$ is the usual Laplacian on the sphere. It is, of course,
well known that the eigenvalues of the Laplacian on the sphere are
$-n(n+1)$, n = 0,1,2,... with the nth being of multiplicity 2n+1. Thus
the general solution of the above equation starting from the singular
distribution concentrated at $\theta = \phi = 0$ is

$$p(t,\theta,\phi) = \sum_{n=0}^{\infty} \sum_{k=-n}^{n} P_{nk}(\cos\phi)\, e^{ik\theta}\, e^{-\frac{1}{2}n(n+1)t}$$

where the $P_{nk}$ are the spherical harmonics. We also see that the
eigenvalues are a measure of the speed with which the density
approaches steady state.
On the basis of this Green's function one can, of course,
express the general solution of the Fokker-Planck equation in terms
of its initial value. Thus we have, in terms of the spherical
harmonics, a complete solution to the Fokker-Planck equation.

This is classical.


On the other hand, it is possible to be almost as explicit in
the other cases as well. This comes about because the 2n+1
equations for the coefficients of the spherical harmonics of the
form $P_{nk}(\cos\phi)e^{ik\theta}$, $k=0,\pm 1,\ldots,\pm n$ are decoupled from those
corresponding to $P_{n'k}(\cos\phi)e^{ik\theta}$ for $n \ne n'$. Thus the solution of the
Fokker-Planck equation reduces to a sequence of linear differential
equations; the nth entry in the sequence being a coupled set of
2n+1 equations. It happens, however, that there is a simple
connection between the moment equations of section 4.2 and the
equations for the coefficients of the spherical harmonics. We
describe this for the $S^2$ situation but similar results hold on
spheres of any dimension.

For an $S^2$ equation x is a 3-vector and $x^{[p]}$ is of dimension
(p+1)(p+2)/2. The equation for $x^{[p]}$ includes all linearly independent
p-forms in x; thus it includes (p-1)p/2 terms of the form

$$(x_1^2+x_2^2+x_3^2)\, x^{[p-2]}$$

Hence we can partition $x^{[p]}$ into two parts of dimension (p-1)p/2
and (p+1)(p+2)/2 - (p-1)p/2 = 2p+1, respectively according to
whether the components have a factor of $x_1^2+x_2^2+x_3^2$ or not. Now of
course $x_1^2+x_2^2+x_3^2 = 1$ so that the components which do contain this
factor can be thought of as moment equations of a lower order and
hence they evolve independently of the second part of the equation.
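The dimension count used in this partition can be confirmed by enumerating monomials directly. The following sketch is only a bookkeeping check; the helper names are assumptions, and monomials are represented as sorted tuples of variable indices.

```python
from itertools import combinations_with_replacement

def monomials(p, nvars=3):
    """All degree-p monomials in nvars variables, as tuples of indices."""
    return list(combinations_with_replacement(range(nvars), p))

def split_dimensions(p):
    """Return (dim of x^[p], count carrying the factor |x|^2, remainder)."""
    total = len(monomials(p))
    # Terms of the form (x1^2 + x2^2 + x3^2) * (degree p-2 monomial).
    with_factor = len(monomials(p - 2)) if p >= 2 else 0
    return total, with_factor, total - with_factor

table = {p: split_dimensions(p) for p in range(1, 9)}
```

For each p the remainder is 2p+1, the number of spherical harmonics of order p, which is what makes the identification with the harmonic coefficients possible.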

On the other hand, the 2p+1 components which do not contain
$x_1^2+x_2^2+x_3^2$ as a factor evolve independently as well. Collecting
these facts we see that the moment equations have the structure

$$\frac{d}{dt}\mathcal{E}x^{[p]}(t) = \begin{bmatrix} A_\delta & & & & \\ & A_{\delta+2} & & & \\ & & \ddots & & \\ & & & A_{p-2} & \\ & & & & A_p \end{bmatrix} \mathcal{E}x^{[p]}(t)$$

where $\delta$ is zero or one depending on whether p is even or odd.

The dimension of $A_p$ is (2p+1) by (2p+1) and the coefficients of
the spherical harmonics of type $P_{nk}$, n fixed, $k=0,\pm 1,\pm 2,\ldots,\pm n$ are
governed by the differential equation

$$\dot y(t) = A_p y(t)$$

Thus the spectrum of the operators

$$\Big(A - \frac{1}{2}\sum_{i=1}^{m} B_i^2\Big)_{[p]} + \frac{1}{2}\sum_{i=1}^{m} (B_{i[p]})^2$$

which were derived in section 4.2, governs the relaxation time of
the process. In case a above we have already commented that the
spectrum is $-\frac{1}{2}(n(n+1))$ with the nth term being of multiplicity
2n+1. In case b there is less diffusion and one would expect the
relaxation to be slower. This is the case; a calculation shows
that the first few entries of the spectrum compare with case a as
follows.




. . . case a 
. . . case b 


Finally, we remark that examples b, c, and d are specific
cases of the hypoelliptic operators of Hörmander [29].

Exercises 

1. Consider the linear stochastic equation 


'dx 1 (t)‘ 

r- 

dx 2 (t) 

- 

approximation 

"dx^t)" 

- 

dx 2 (t) 

m 

_dx 3 (t)_ 

L 


1/2 1 
-1 - 1/2 


1 

dw (t) 

dt+ 

1 

J 

_dw 2 (t)_ 


x(0) - 0 


as an approximation to the first two components of the S 

1 ^ 


equation 


-dt 

-dw. 


dt 

1 


dWj 

dt dw, 


-dw 2 -dt 


x^t)' 

x 2 (t) 

Lx 3 (t)J 


x(0) 


Compute the second moment in each case and compare. 

2 

2. Consider the stochastic equation on S defined by 


"dx^Ct)" 


-dt/2 dw x 0 ~ 


X 

H* 

ft 

dx 2 (t) 

9* 

-dw^ -(l+p)dt/2 pdw 2 


x 2 (t) 

dx 3 (t)_ 


0 -pdw 2 0 


x 3 (t) 


Find the first few eigenvalues of corresponding Fokker-Planck 
operator as a function of p. 


V. STABILITY THEORY 

In the study of ordinary differential equations on Lie groups
both linear and nonlinear problems are of interest; however, in
these notes we discuss linear problems only. Of course the most
common stability problems encountered in control concern the
general linear group. However in the study of specific applications
other groups may occur. For example, in the case of problems
arising in classical mechanics the symplectic group plays a major
role. Moreover since tensoring will typically transform a system
evolving in $G\ell(n)$ into one which evolves on some subgroup of $G\ell(q)$
it is desirable to take a general point of view.

5.1 Stability of the $x^{[p]}$ Equations

The following theorem is an obvious consequence of the calculations
in section 1.2.

Theorem 1: The null solution of the system

$$\dot x(t) = A(t)x(t)$$

is stable (asymptotically stable) if and only if the null solution
of the equation

$$\dot y(t) = A_{[p]}(t)y(t)$$

is stable (asymptotically stable). Moreover if all solutions of
the first equation are bounded by $\|x(t)\| \le Me^{-\lambda t}$ then all solutions
of the second are bounded by $\|y(t)\| \le M^p e^{-p\lambda t}$.

When combined with standard estimates this theorem can give
very precise information about high order systems which are either
in the form of $\dot y(t) = A_{[p]}(t)y(t)$ or else in the form

$$\dot y(t) = A_{[p]}(t)y(t) + D(t)y(t)$$

with D(t) small in some sense.

Example: We know from Liapunov [see e.g. [30]] that all solutions
of the Sp(2) equation

$$\begin{bmatrix} \dot x_1(t) \\ \dot x_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -p(t) & 0 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}$$

are bounded if p(·) is pointwise nonnegative, periodic of period T
with positive average value and with

$$\int_0^T p(t)\,dt < 4/T$$

Thus we see that all solutions of the $x^{[2]}$ equation

$$\begin{bmatrix} \dot y_1(t) \\ \dot y_2(t) \\ \dot y_3(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ -2p(t) & 0 & 2 \\ 0 & -p(t) & 0 \end{bmatrix} \begin{bmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{bmatrix}$$

are also bounded under the same hypothesis. (Here we have taken
$y_2 = 2x_1x_2$ instead of $\sqrt{2}\,x_1x_2$.) A change of basis puts this equation
in a more symmetric form
\(tj] r o \ (i-p(t>) o “I r 2l ( t ) 

z 2 (t) ** -|(l-p(t)) 0 \ (1+P(t)) z 2 (t) 

z 3 (t) 0 j (l+p(t)) 0 _ z 3 (t) 

This equation evolves on the pseudo-orthogonal group SO(2,1).
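The relation between the Sp(2) equation and its $x^{[2]}$ lift can be verified by integrating both and comparing. The sketch below assumes a particular periodic coefficient `p`, an initial state and a step count purely for illustration, and takes the lifted state to be $(x_1^2,\ 2x_1x_2,\ x_2^2)$ as in the example.

```python
import math

def p(t):
    # Assumed periodic, nonnegative coefficient (period 1).
    return 1.0 + 0.5 * math.cos(2.0 * math.pi * t)

def f_x(t, x):
    """Right side of the second order system x1' = x2, x2' = -p(t) x1."""
    return [x[1], -p(t) * x[0]]

def f_y(t, y):
    """Right side of the lifted system for y = (x1^2, 2 x1 x2, x2^2)."""
    return [y[1], -2.0 * p(t) * y[0] + 2.0 * y[2], -p(t) * y[1]]

def rk4(f, state, t_end, steps):
    """Fixed-step classical Runge-Kutta integration from t = 0."""
    h = t_end / steps
    t, s = 0.0, list(state)
    for _ in range(steps):
        k1 = f(t, s)
        k2 = f(t + h / 2, [a + h / 2 * b for a, b in zip(s, k1)])
        k3 = f(t + h / 2, [a + h / 2 * b for a, b in zip(s, k2)])
        k4 = f(t + h, [a + h * b for a, b in zip(s, k3)])
        s = [a + h / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(s, k1, k2, k3, k4)]
        t += h
    return s

x0 = [1.0, 0.5]
y0 = [x0[0] ** 2, 2.0 * x0[0] * x0[1], x0[1] ** 2]
x_end = rk4(f_x, x0, 3.0, 6000)
y_end = rk4(f_y, y0, 3.0, 6000)

lifted = [x_end[0] ** 2, 2.0 * x_end[0] * x_end[1], x_end[1] ** 2]
mismatch = max(abs(a - b) for a, b in zip(y_end, lifted))
```

The mismatch stays at the level of the integration error, so boundedness of x carries over to y, which is the point of the example.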

One particular fact which should be mentioned here is that
systems with a single time varying parameter, say

$$\dot x(t) = Ax(t) + k(t)Bx(t) \qquad (*)$$

go into systems with a single time varying parameter, e.g.

$$\dot x^{[p]}(t) = (A_{[p]} + k(t)B_{[p]})x^{[p]}(t)$$

Thus the many useful results about (*) (circle criterion, [17],
etc.) can be extended in a nontrivial way.

Exercises 

1. It is known that all solutions of the differential equation

$$\ddot x(t) + \dot x(t) + k(t)x(t) = 0$$

remain bounded if $0 < k(t) \le 3.9$ (see [17]). On the other hand,
if one picks a positive definite quadratic form in x and $\dot x$, say
$v(x,\dot x)$, and computes its derivative along solutions of the given
differential equation then there exists one quadratic form which
implies stability via Liapunov theory for $0 \le k(t) \le 1$, but the
constant 1 cannot be improved on using a quadratic Liapunov function.
However, if we look at the $x^{[p]}$ version of the differential equation
then a quadratic Liapunov function for $x^{[p]}$ is a 2p-degree Liapunov
function for the original equation and a more suitable Liapunov
function can be found. Work out the details.

2. Consider a differential equation in $\mathbb{R}^n$

$$\dot x(t) = Ax(t) + k(t)Bx(t)$$

Suppose that A and B generate a four dimensional Lie algebra which
is isomorphic with $g\ell(2)$. Use the theory of the representations
of $g\ell(2)$ (see, e.g. Samelson [4] page 114) and the circle criterion
(see, e.g. [17]) to derive stability criteria for the given system.

5.2 Periodic Self-Contragradient Systems 

A matrix Lie algebra is said to be self-contragradient if
there exists a matrix P such that

$$PLP^{-1} = -L'$$

for all L in the Lie algebra. For example, so(n) is self-contragradient
with P=I and sp(n) is self-contragradient with P=J. As
far as the stability of periodic systems is concerned, the important
consequence of this assumption is that if A(t) satisfies
$PA(t)P^{-1} = -A'(t)$ then the transition matrix for

$$\dot x(t) = A(t)x(t)$$

satisfies

$$\Phi_A'(t) P \Phi_A(t) = P$$

since

$$\frac{d}{dt}\big(\Phi_A'(t) P \Phi_A(t)\big) = \Phi_A'(t)\big(A'(t)P + PA(t)\big)\Phi_A(t) = 0$$

Thus $\Phi_A(t)$ is similar to $(\Phi_A'(t))^{-1}$. As an immediate consequence of this
fact we see that the eigenvalues of $\Phi_A(t)$ occur in reciprocal
pairs: if $\lambda$ is an eigenvalue then so is $1/\lambda$. If we assume we
are dealing with real systems then of course the eigenvalues
occur in complex conjugate pairs as well.
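The invariance $\Phi_A' P \Phi_A = P$ can be observed numerically in the simplest symplectic case. The sketch below is an illustration under assumptions: the coefficient `p`, the step count and all helper names are introduced here, P is taken to be the 2 by 2 matrix J, and the transition matrix is obtained by Runge-Kutta integration of $\dot\Phi = A(t)\Phi$.

```python
import math

def p(t):
    # Assumed periodic coefficient; A(t) = [[0, 1], [-p(t), 0]] lies in sp(2).
    return 1.0 + 0.5 * math.cos(2.0 * math.pi * t)

def a_matrix(t):
    return [[0.0, 1.0], [-p(t), 0.0]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def axpy(X, c, Y):
    """Return X + c*Y entrywise."""
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

def transition(t_end, steps):
    """Integrate Phi' = A(t) Phi, Phi(0) = I, with classical RK4."""
    h = t_end / steps
    t, phi = 0.0, [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(steps):
        k1 = matmul(a_matrix(t), phi)
        k2 = matmul(a_matrix(t + h / 2), axpy(phi, h / 2, k1))
        k3 = matmul(a_matrix(t + h / 2), axpy(phi, h / 2, k2))
        k4 = matmul(a_matrix(t + h), axpy(phi, h, k3))
        incr = axpy(axpy(k1, 2.0, k2), 1.0, axpy(k4, 2.0, k3))
        phi = axpy(phi, h / 6, incr)
        t += h
    return phi

J = [[0.0, 1.0], [-1.0, 0.0]]
phi = transition(1.7, 4000)
phi_t = [list(r) for r in zip(*phi)]
form = matmul(matmul(phi_t, J), phi)            # should reproduce J
residual = max(abs(form[i][j] - J[i][j]) for i in range(2) for j in range(2))
det = phi[0][0] * phi[1][1] - phi[0][1] * phi[1][0]  # product of eigenvalues
```

Since the product of the two eigenvalues equals the determinant, a determinant of one is exactly the reciprocal-pair property in this 2 by 2 setting.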

If A(t) = A(t+T) then the well known Floquet theory insures
that the transition matrix for

$$\dot x(t) = A(t)x(t)$$

can be expressed as

$$\Phi_A(t) = Q(t)e^{Rt}; \quad Q(0) = I$$

with Q(t+T) = Q(t) and R constant, though not necessarily real.

The value of $\Phi_A(T)$ is decisive as far as the stability of a
periodic system is concerned since $\Phi_A(nT) = [\Phi_A(T)]^n$.

If A(t) is given by

$$A(t) = \sum_{i=1}^{m} a_i(t)A_i$$

with the $A_i$ being a basis for a self-contragradient representation
of a Lie algebra, then of course

$$\Phi_A'(t) P \Phi_A(t) = P$$

for all t. If $(\Phi_A(T))^n$ is bounded for n = 1,2,... then we call
$\Phi_A(T)$ stable. We call it P-strongly stable if it happens that for
all sufficiently small R such that R'P+PR = 0, the matrix $e^R\Phi_A(T)$
is also stable. (Compare with [31].) In view of the fact that
the eigenvalues of a matrix depend continuously on the elements of
the matrix and in view of the fact that the eigenvalues of $\Phi_A$
must occur in reciprocal pairs, we see that if the eigenvalues of
$\Phi_A(T)$ are distinct and if $\Phi_A(T)$ is stable, then it is P-strongly
stable. However it can happen that $\Phi_A(T)$ is P-strongly stable
even if the eigenvalues of $\Phi_A(T)$ are not distinct.

Theorem 1: If $\{A_i\}$ is the basis for a self-contragradient matrix
Lie algebra, $A_i'P+PA_i = 0$, and if

$$\dot x(t) = \Big(\sum_{i=1}^{m} a_i(t)A_i\Big)x(t)$$

is periodic and if $\Phi_A(T)$ is P-strongly stable, then there exists
$\epsilon > 0$ such that for $|b_i(t)-a_i(t)| < \epsilon$ and $b_i(t)$ periodic of period
T the system

$$\dot x(t) = \Big(\sum_{i=1}^{m} b_i(t)A_i\Big)x(t)$$

is stable.
Exercises

1. Determine if for P = J the matrix

$$M = \begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & \cos\theta & 0 & \sin\theta \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & -\sin\theta & 0 & \cos\theta \end{bmatrix}; \quad 0 < \theta < \pi$$

is P-strongly stable or not. See [30], theorem 8.

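2. Show that if p(t) is periodic of period T with average value
zero and if

$$\begin{bmatrix} \dot x_1(t) \\ \dot x_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & -p(t) \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}$$

then $\Phi_A(T)$ is symplectic although $\Phi_A(t)$ for $t \ne T$ need not be.
The corresponding $x^{[2]}$ equation is expressible as

$$\frac{d}{dt}\begin{bmatrix} x_1^2 \\ x_1x_2 \\ x_2^2 \end{bmatrix} = \begin{bmatrix} 0 & 2 & 0 \\ -1 & -p(t) & 1 \\ 0 & -2 & -2p(t) \end{bmatrix} \begin{bmatrix} x_1^2 \\ x_1x_2 \\ x_2^2 \end{bmatrix}$$

Use the idea of strong stability to investigate the stability of
these systems.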


3. If D is diagonal then D+H is similar to a diagonal matrix if H
is any symmetric matrix. However if D is diagonal there may exist
an n(n-1)/2 dimensional set, the upper triangular matrices, such
that D+T is not diagonalizable; consider the identity. Relate
this to strong stability.


5.3 The Symplectic Case 


In the special case of the symplectic group Krein [30] has 
given an elegant theorem on how large the perturbation in Theorem 1 
of the previous section can be. We give an application of this 
theorem and some facts about realizations of feedback systems as 
well. 

Notice that the second order system with Q(t) symmetric

$$\ddot x(t) + Q(t)x(t) = 0; \quad x(t) \in \mathbb{R}^n$$

is equivalent to the symplectic system

$$\begin{bmatrix} \dot x_1(t) \\ \dot x_2(t) \end{bmatrix} = \begin{bmatrix} 0 & I \\ -Q(t) & 0 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}$$

Krein has investigated this set of equations and more general ones.
One of his results reads as follows.

Theorem 1: Let P(t) = P(t+T) = P'(t), then all solutions of the
equation

$$\ddot x(t) + P(t)x(t) = 0$$

are bounded if

  i)   $P(t) \ge 0$ for all t
  ii)  $\int_0^T P(t)\,dt > 0$ (positive definite)
  iii) $(4/T)I - \int_0^T P(t)\,dt > 0$ (positive definite)

Proof: See Krein [30], page 165.
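In the scalar case the three conditions reduce to Liapunov's classical test, and the conclusion can be cross-checked against the monodromy matrix. The sketch below is an illustration only: the coefficient $P(t) = a(1+\cos 2\pi t)$ with a = 3, the period T = 1, the step counts and the helper names are all assumptions. Boundedness of all solutions of a scalar Hill equation corresponds to a monodromy trace of magnitude less than 2.

```python
import math

T = 1.0
AMPLITUDE = 3.0  # assumed value; the integral of P over one period is 3 < 4/T

def P(t):
    """Assumed nonnegative periodic coefficient of period T."""
    return AMPLITUDE * (1.0 + math.cos(2.0 * math.pi * t / T))

def f(t, s):
    """First order form of x'' + P(t) x = 0 with s = (x, x')."""
    return [s[1], -P(t) * s[0]]

def rk4(state, t_end, steps):
    h = t_end / steps
    t, s = 0.0, list(state)
    for _ in range(steps):
        k1 = f(t, s)
        k2 = f(t + h / 2, [a + h / 2 * b for a, b in zip(s, k1)])
        k3 = f(t + h / 2, [a + h / 2 * b for a, b in zip(s, k2)])
        k4 = f(t + h, [a + h * b for a, b in zip(s, k3)])
        s = [a + h / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(s, k1, k2, k3, k4)]
        t += h
    return s

# Conditions i)-iii), checked on a grid with a midpoint-rule integral.
n = 2000
grid = [(j + 0.5) * T / n for j in range(n)]
nonnegative = all(P(t) >= 0.0 for t in grid)
integral = sum(P(t) for t in grid) * T / n

# Columns of the monodromy matrix Phi(T) and its trace.
col1 = rk4([1.0, 0.0], T, 4000)
col2 = rk4([0.0, 1.0], T, 4000)
trace = col1[0] + col2[1]
```

Raising `AMPLITUDE` past 4 violates condition iii); the theorem then gives no information, and the trace must be inspected directly.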


As an example of an application of this result to problems
of the type which arise frequently in system theory we prove the
following theorem. (Compare with [32].)

Theorem 2: Suppose that q(s) and p(s) are polynomials without
common factors. Suppose further that q(s)/p(s) is an even
function of s with all its poles and zeros on the imaginary axis
and assume that the poles and zeros of sq(s)/p(s) interlace. Let
D = d/dt and let k(·) be periodic of period T. Then all solutions
of the nth order differential equation

$$p(D)x(t) + k(t)q(D)x(t) = 0$$

are bounded provided

$$0 < \int_0^T |\lambda(t)|^2\,dt < 4/T$$

where $\lambda(t)$ denotes the zero of p(s)+k(t)q(s) = 0 which has the
largest magnitude.

Proof: Write q(s)/p(s) as $r(s^2)/m(s^2)$ with r and m polynomials.
This is possible because q(s)/p(s) is even. Write r(s)/m(s) as
$b'(Is+A)^{-1}b$ with A = A'. This is possible because the poles and
zeros of r(s)/m(s) interlace. (See [25].) Thus

$$q(s)/p(s) = b'(Is^2+A)^{-1}b$$

and the differential equation in the theorem statement is
equivalent to the system

$$\ddot x(t) + (A+k(t)bb')x(t) = 0$$

Krein's result implies stability if

$$(4/T)I - \int_0^T (A+k(t)bb')\,dt > 0$$

But since the largest eigenvalue of the sum of two positive definite
symmetric matrices is less than or equal to the sum of the largest
eigenvalues of the respective matrices there is a corresponding
inequality for integrals and we see that

$$\lambda_{\max}\Big(\int_0^T (A+k(t)bb')\,dt\Big) \le \int_0^T |\lambda(t)|^2\,dt$$

The result then follows.

It is interesting to compare this result with the analogous
facts about completely symmetric systems investigated in [25].

Also notice that this theorem captures Liapunov's original theorem
as a special case, as does the basic theorem of Krein.
Exercises 

1. Use these results to investigate the stability of the scalar
equation

$$x^{(4)} + 4x^{(2)} + 3x + k(t)(x^{(2)} + x) = 0$$

with k(t) periodic.

2. Derive a matrix version of Theorem 2. 
1. INTRODUCTION

Electrical networks utilizing electronic switching are used to obtain 
a variety of results which are difficult or impossible to obtain with conven- 
tional linear circuits. Typical applications include circuits which perform 
elementary control functions, circuits for DC to DC voltage conversion, 
circuits for frequency conversion, etc. We will be mostly concerned with 
networks designed for their power handling capability. A reasonably complete 
storage elements play a significant role. The main thrust of our paper is
in this direction. 

The most natural description of the basic equations of motion for 
circuits of interest here is a set of first order differential equations 
describing the time evolution of the inductor currents and the capacitor 
voltages. We find that in many cases networks containing diodes, controlled 
switches, linear time-invariant inductors, capacitors and resistors, current 
and voltage sources, can be modeled by equations of the form 

    \dot{x}(t) = \Big( A_0 + \sum_{i=1}^{m} u_i(t) A_i \Big) x(t) + b_0 + \sum_{i=1}^{m} u_i(t) b_i

where the u_i(t) model the effects of switches. Because the right side of this
equation contains a set of bilinear terms, it is often referred to as a
bilinear system. It serves as the starting point for the analysis of
the dynamical behavior of the electrical network. Next we introduce a 
set of approximations — based on averaging — which yield a family of 
bilinear equations in which the average values of the switch variables occur. 
The approximate systems can then be used to design control strategies, either 
based on linearization and conventional frequency response or else on some of
the new stabilization methods introduced here in section 6. The key step
is the basic averaging approximation and the various refinements of it. It 
is at this point and in the treatment of bilinear equations that we make 
some use of Lie theory. 

The paper concludes with the detailed analysis of an example illustrating 
the use of the techniques discussed in the design of pulse-width modulated 
regulators for DC to DC conversion systems.

2. MODELS FOR DIODES AND CONTROLLED SWITCHES

In the later sections we will want to model all circuits under 
consideration by circuits which contain only linear time invariant inductors, 
capacitors, and resistors, sources and ideal switches. By an ideal switch
we understand a circuit element which either transmits no current, regardless
of the voltage drop across it, or else has no voltage drop across it, regardless
of the current through it, and which can change from one of these modes to
the other on command, regardless of the current it is carrying.

Our first point is that ideal switches are actually good models 
for a large number of circuits which are easily built. It is true that 
silicon controlled rectifiers and power transistors are only approximated by 
ideal switches in certain regimes because of their turn-on and turn-off 
dynamics, the fact that they conduct in one direction only, etc. However, 
there are simple circuits for making these devices bidirectional such as 
the one shown in figure 1. This figure shows a diode bridge with a uni- 
directional device in the middle. The overall circuit will conduct current 
in both directions and can therefore be modeled by a simple switch. The
turn-on and turn-off dynamics, while important in some applications, will be
ignored here. We justify this on the grounds that these effects are second
order compared with the analysis considered here. The restriction to
bidirectional devices is justified on the grounds that the presence of
unidirectional devices would complicate our analysis and furthermore they
can be eliminated (sometimes with good effect on system performance)
via the circuit in figure 1 or some modification of it.

Figure 1 : A bidirectional switch made with 4 diodes and a uni-
directional switch, together with its equivalent.

We also want to consider three terminal switches of the form represented
by figure 2. These can be realized in various ways depending on whether or
not fully bidirectional behavior is required. A more or less typical
example is the circuit shown in figure 3. It is accurately modeled by
replacing the controlled switch and the diode by a three terminal switch
provided the current is always flowing through the inductor in the positive
sense.

Figure 2 : A three terminal switch.

Figure 3 : A nonlinear network and a controlled switch
model incorporating a three terminal switch.

A general approach to modeling circuits with switches cannot be based 
on impedance descriptions since the ideal switches do not have impedance
characterizations. However a scattering variable description of a switch
does exist and it is desirable to base the whole approach to modeling 
networks with switches on a scattering variable formulation. That is, 
if we have a network with switches and sources we extract the switches and 
relate i+v and i-v across the switch via 

(i+v) = u(t) (i-v) 

When the switch is closed i+v = i-v and when it is open i+v = -(i-v). 

Thus the switch value, u(t), is either plus or minus one. 
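
The scattering relation can be checked numerically. The following is a
minimal sketch, not part of the original development; the function name
scattering_residual and the sample values of i and v are illustrative
assumptions. It evaluates (i+v) - u(i-v) for a closed switch (v = 0, u = +1)
and for an open switch (i = 0, u = -1).

```python
def scattering_residual(i, v, u):
    """Residual of the ideal-switch relation (i + v) = u * (i - v)."""
    return (i + v) - u * (i - v)

# Closed switch: no voltage drop (v = 0), arbitrary current, u = +1.
closed = scattering_residual(i=2.5, v=0.0, u=+1)

# Open switch: no current (i = 0), arbitrary voltage, u = -1.
opened = scattering_residual(i=0.0, v=7.0, u=-1)

# A closed-switch operating point does not satisfy the open-switch relation.
mismatch = scattering_residual(i=2.5, v=0.0, u=-1)

print(closed, opened, mismatch)
```

Both admissible modes give a zero residual, while pairing a conducting
state with u = -1 does not, which is the sense in which u(t) selects the
mode of the switch.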

Based on the above discussion we claim that a good understanding of 
networks of the form shown in figure 4 would contribute to our ability to 
design useful circuits. Moreover one easily sees that subject to the basic
well-posedness conditions, such as arise out of the necessity of not having
any capacitive loops or inductor cut sets regardless of the switch configura-
tion, such circuits always yield state equations of the bilinear form given
in the introduction,

    \dot{x}(t) = \Big( A_0 + \sum_{i=1}^{m} u_i(t) A_i \Big) x(t) + b_0 + \sum_{i=1}^{m} u_i(t) b_i

We refer the reader to standard sources for the justification of this
remark. See [1] for background and references.

Figure 4 : The general time invariant switchable electrical
network with two and three terminal switches.

3. CYCLIC PROCESSES AND VECTOR FIELDS

In section 4 we will discuss certain simplifications for commutated 
electrical networks which operate in a quasi-periodic mode. In order to 
motivate the type of analysis which is carried out there we want to 
discuss the role of switches from a particular point of view which has 
to do with the cyclic nature of the processes in question. 

The idea is illustrated with the circuit shown in figure 3. This 
circuit can be regarded as a model for a simple voltage converter. We 
consider the time evolution of the inductor current and capacitor voltage. 
In terms of these coordinates we have the two different types of integral 
curves, depending on the switch position. The choices are shown in
figure 5. By following the paths alternately we can effectively stand
still at an average position. There are, at the next level of complication,
circuit effects which cannot be explained on the basis of simple averaging.
In general what one does is to allow the system to follow alternate paths
(vector fields) in a definite order, thereby creating effects which are
not achievable by following any one of the fixed available paths. This
point of view is the basis for studying the controllability of nonlinear
systems. See the recent paper of R. Hirschorn [7] for a systematic
account.

Figure 5 : (a) switch closed right; (b) switch closed left;
(c) typical cycle

4. THE MATHEMATICAL ANALYSIS

We now turn to the analysis of circuits modeled by equation (1).
This will require certain results from the theory of linear systems
and Lie algebras. All necessary background material can be found in
[2] and [3].

If A and B are n by n matrices then we use the bracket symbol [A,B] 
to denote the commutator product 

[A,B] = AB-BA 

A subset of the space of n by n matrices which (a) is a linear space,
and (b) contains [A,B] whenever it contains A and B, is called a matrix
Lie algebra. If A and B are any two n by n matrices then we can find
the smallest Lie algebra which contains them simply by forming their
commutator product, taking linear combinations, more commutator products,
etc. This process stops in a finite number of steps because the set of
n by n matrices is finite dimensional. We denote the smallest Lie algebra
which contains A, B, ..., C by {A,B,...,C}_A.
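
The construction just described (form commutators, take linear
combinations, repeat until nothing new appears) can be sketched in a few
lines. This is an illustrative sketch only, assuming exact rational
arithmetic and a simple elimination test for linear independence; the
function names are ours. The second example uses the converter matrices
A and B of section 7 with the assumed value R = 1.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(X, Y):
    """Commutator product [X,Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def reduce_vector(echelon, vec):
    """Eliminate from vec the pivot entries of the rows collected so far."""
    for pivot, row in echelon:
        if vec[pivot] != 0:
            factor = vec[pivot] / row[pivot]
            vec = [v - factor * r for v, r in zip(vec, row)]
    return vec

def generated_lie_algebra(generators):
    """Return matrices forming a basis of the smallest Lie algebra
    containing the generators (exact rational arithmetic)."""
    echelon, basis = [], []
    queue = [[[Fraction(e) for e in row] for row in G] for G in generators]
    while queue:
        M = queue.pop(0)
        reduced = reduce_vector(echelon, [e for row in M for e in row])
        pivot = next((k for k, e in enumerate(reduced) if e != 0), None)
        if pivot is None:
            continue  # M is a linear combination of the basis found so far
        echelon.append((pivot, reduced))
        queue.extend(bracket(M, N) for N in basis)  # new commutator products
        basis.append(M)
    return basis

# Two nilpotent 2 by 2 matrices generate the trace-free matrices.
sl2 = generated_lie_algebra([[[0, 1], [0, 0]], [[0, 0], [1, 0]]])

# Converter matrices of section 7, with R = 1 assumed for illustration.
R = 1
A = [[0, 1], [-1, -R]]
B = [[0, -1], [1, 0]]
converter = generated_lie_algebra([A, B])

print(len(sl2), len(converter))
```

The loop terminates because each accepted matrix raises the dimension of
the span, which cannot exceed n squared.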

Given a differential equation system

    \dot{x}(t) = (A + u(t)B)x(t) + u(t)b + a                    (1)

it is known (see, e.g. [3]) that the transition matrix \Phi_A(t) must belong
to the set

    {exp{A,B}_A}_G = { M : M = e^{L_1} e^{L_2} ... e^{L_k},  L_i \in {A,B}_A }

Moreover, it is known that "fairly large" subsets of {exp{A,B}_A}_G can
in fact be achieved. See the recent paper of R. Hirschorn [7] and
his references to the work of Jurdjevic and Sussmann.

Suppose that we have a linear time invariant system of the form

    \dot{x}(t) = A_0 x(t) + B_0 u(t);    y(t) = C_0 x(t)

with (A_0, B_0, C_0) a controllable and observable (minimal) system. Suppose
A(·), B(·) and C(·) are periodic functions of time such that ||A(·) - A_0||,
||B(·) - B_0|| and ||C(·) - C_0|| are all less than \epsilon. Then for \epsilon
sufficiently small the periodic system

    \dot{x}(t) = A(t)x(t) + B(t)u(t);    y(t) = C(t)x(t)

will be minimal as well. The input-output map for the periodic system,
i.e. the map u -> y defined by

    y(t) = \int_0^t C(t) \Phi_A(t,\sigma) B(\sigma) u(\sigma) d\sigma

will be close* to that of the time invariant system provided that both
are asymptotically stable. This suggests that one might replace the periodic
system by a time invariant one obtained by averaging over one period provided
the resulting system is both stable and minimal. There are two basic pro-
perties of any such approximation which one should demand: (a) if we replace
t by t+a the approximation should not change, and (b) A_0 should belong to
the Lie algebra generated by {A(t_i)}, so that the "average" system is
not exhibiting behavior that the original system could not duplicate for any
choice of u. This is significant, for example, if one is to avoid pitfalls
such as approximating a lossless network with a lossy one, etc.

* Say with respect to the operator topology induced by letting u and y
belong to L_2[0,\infty).

The approximation based on simple averaging becomes less satisfactory
as the variation about the average becomes larger or as the period of the 
variation becomes larger. In these cases one wants to refine this approxi- 
mation further. We describe how this can be done in the piecewise constant 
case. Notice that given two real matrices A and B there is no guarantee 
that there exists a real matrix C such that 

    e^A e^B = e^C

This will be the case however, if A and B are small enough and then C will
be given by a convergent infinite series

    C = A + B + \frac{1}{2}[A,B] + \frac{1}{12}[[A,B],B] + \frac{1}{12}[[B,A],A] + ...

known as the Baker-Campbell-Hausdorff formula. Various expressions like
this actually form the basis for a large number of useful approximations in 
physics [4]. What we find here is that they can be quite useful in the 
analysis of electrical networks as well. 
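
A quick numerical check shows how the successive terms of the series
improve the approximation. The sketch below is illustrative only: it
assumes a truncated Taylor series for the matrix exponential (adequate for
the small matrices used) and two arbitrarily chosen 2 by 2 matrices of
norm about 0.1. It compares e^A e^B with e^C for C truncated after the
first, second and third order terms, using the standard coefficients 1/2
and 1/12.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y, c=1.0):
    """Return X + c*Y."""
    return [[x + c * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def bracket(X, Y):
    return add(matmul(X, Y), matmul(Y, X), -1.0)

def mat_exp(X, terms=30):
    """Matrix exponential by truncated Taylor series (small matrices only)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[e / k for e in row] for row in matmul(term, X)]
        result = add(result, term)
    return result

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[0.0, 0.1], [-0.1, -0.1]]
B = [[0.0, -0.1], [0.1, 0.0]]
product = matmul(mat_exp(A), mat_exp(B))

AB = bracket(A, B)
C1 = add(A, B)                                   # A + B
C2 = add(C1, AB, 0.5)                            # + (1/2)[A,B]
C3 = add(add(C2, bracket(AB, B), 1.0 / 12.0),    # + (1/12)[[A,B],B]
         bracket(bracket(B, A), A), 1.0 / 12.0)  # + (1/12)[[B,A],A]

errors = [max_abs_diff(product, mat_exp(C)) for C in (C1, C2, C3)]
print(errors)
```

For matrices of this size each additional order reduces the error by
roughly an order of magnitude.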

The idea is this. Suppose that A(t) is given by A(t) = A(t+T) and

    A(t) = A,    0 \le t \le a
    A(t) = B,    a < t \le T

Then we want an approximation for e^{Aa} e^{B(T-a)}. The Baker-Campbell-Hausdorff
formula gives such a result, namely e^C where C is as above. Now if we want
a formula for C which is independent of order, i.e. insensitive to a shift
of origin of the time axis, then we must drop out the [A,B] term to get

    \ln \frac{1}{2}(e^A e^B + e^B e^A) \approx A + B + \frac{1}{12}\{[[A,B],B] + [[B,A],A]\}

                                       = A + B + \frac{1}{12}[[A,B],B-A]

Putting these ideas together we obtain a series of approximations for
piecewise constant periodic systems. If the system equations are

    \dot{x}(t) = A(t)x(t) + Gu(t);    y(t) = Hx(t)

with A(t+T) = A(t) and A(t) as above, then the first approximation is

    \dot{x}(t) = [\frac{a}{T} A + (1 - \frac{a}{T})B]x(t) + Gu(t);    y(t) = Hx(t)

The second approximation is

    \dot{x}(t) = [\frac{a}{T} A + (1 - \frac{a}{T})B + (\frac{aT}{12} - \frac{a^2}{12})[[A,B], (1 - \frac{a}{T})B - \frac{a}{T} A]]x(t) + Gu(t);

    y(t) = Hx(t)

and higher degrees of approximation can be generated by taking more terms
in the Baker-Campbell-Hausdorff formula.
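
The two approximations can be compared numerically against the symmetrized
one-period product (1/2)(e^{Aa} e^{B(T-a)} + e^{B(T-a)} e^{Aa}). The
sketch below is illustrative and rests on stated assumptions: the
coefficient of the double commutator is taken to be a(T-a)/12, which is
what the symmetrized formula gives when A and B are replaced by Aa and
B(T-a) and the result is divided by T; the matrices and the values
a = 0.08, T = 0.1 are chosen only for the example.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(X, c):
    return [[c * e for e in row] for row in X]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def bracket(X, Y):
    return add(matmul(X, Y), scale(matmul(Y, X), -1.0))

def mat_exp(X, terms=30):
    """Matrix exponential by truncated Taylor series (small matrices only)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(matmul(term, X), 1.0 / k)
        result = add(result, term)
    return result

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def first_approximation(A, B, a, T):
    return add(scale(A, a / T), scale(B, 1.0 - a / T))

def second_approximation(A, B, a, T):
    direction = add(scale(B, 1.0 - a / T), scale(A, -a / T))
    correction = scale(bracket(bracket(A, B), direction), a * (T - a) / 12.0)
    return add(first_approximation(A, B, a, T), correction)

A = [[0.0, 0.0], [0.0, -1.0]]    # generator on 0 <= t <= a
B = [[0.0, 1.0], [-1.0, -1.0]]   # generator on a < t <= T
a, T = 0.08, 0.1

EA, EB = mat_exp(scale(A, a)), mat_exp(scale(B, T - a))
symmetric = scale(add(matmul(EA, EB), matmul(EB, EA)), 0.5)

err_first = max_abs_diff(
    symmetric, mat_exp(scale(first_approximation(A, B, a, T), T)))
err_second = max_abs_diff(
    symmetric, mat_exp(scale(second_approximation(A, B, a, T), T)))
print(err_first, err_second)
```

With these values the second approximation reproduces the symmetrized
product noticeably better than simple averaging does.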

Notice that the inhomogeneous equation

    \dot{x}(t) = A(t)x(t) + b(t)

can be converted to the homogeneous form

    \frac{d}{dt} \begin{bmatrix} x(t) \\ 1 \end{bmatrix} = \begin{bmatrix} A(t) & b(t) \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ 1 \end{bmatrix}

by the simple device of augmenting the x-vector. We will use this trick
when it is necessary to approximate the solutions of

    \dot{x}(t) = Ax(t) + u(t)Bx(t) + u(t)b

in subsequent sections, since it allows us to use the Baker-Campbell-Hausdorff
formula directly.
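
The augmentation device is easy to check in the constant coefficient
case. The following sketch is an illustration under assumptions of our
own: constant A and b, a Taylor-series matrix exponential, and a scalar
test equation \dot{x} = -2x + 3 with x(0) = 1, whose solution
1.5 - 0.5 e^{-2t} is known in closed form.

```python
import math

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, terms=30):
    """Matrix exponential by truncated Taylor series (small matrices only)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[e / k for e in row] for row in matmul(term, X)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def augment(A, b):
    """Build [[A, b], [0, 0]] so that d/dt [x; 1] = M [x; 1]."""
    n = len(A)
    return [list(A[i]) + [b[i]] for i in range(n)] + [[0.0] * (n + 1)]

def solve_inhomogeneous(A, b, x0, t):
    """x(t) for constant A and b, read off from exp(t M) applied to [x0; 1]."""
    M = [[t * e for e in row] for row in augment(A, b)]
    E = mat_exp(M)
    w0 = list(x0) + [1.0]
    return [sum(E[i][j] * w0[j] for j in range(len(w0)))
            for i in range(len(x0))]

x = solve_inhomogeneous([[-2.0]], [3.0], [1.0], 0.5)[0]
expected = 1.5 - 0.5 * math.exp(-1.0)
print(x, expected)
```

Because the augmented system is homogeneous, products of its exponentials
can be combined with the Baker-Campbell-Hausdorff formula in the same way
as for the homogeneous case.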
5. PULSE-WIDTH MODULATED SYSTEMS

By a pulse-width modulated system we understand a special type of 
bilinear system of the form 

    \dot{x}(t) = \Big( A + \sum_{i=1}^{m} u_i(t) B_i \Big) x(t) + \sum_{i=1}^{m} u_i(t) b_i + c

where each u_i(t) can take on only two values. Moreover, there is a
basic pulse period for the system and u_i(t) can only switch between its
two possible values one time in each period. Thus if u switches between
one and zero then a typical u(·) looks as shown in figure 6.



system of pulse period equal to one. 

A pulse width modulated system is a bilinear system and if one averages 
over one period the averaged pulse-width modulated system is also a bilinear 
system. The significant difference is that whereas the original system has 
controls which take on only two values the averaged system has controls 
which take on a continuum of values. This change makes the control problem 
very much easier to study using conventional techniques. It also allows one 
to apply the theory of bilinear systems [5].

6. STABILIZATION

In section 7 we want to show, by example, that the methods of this 
paper can be useful in understanding pulse width modulated control systems. 
However, we also want to indicate how these methods can be used in design.
For that reason we examine the question of stabilizing bilinear systems
by feedback.

Consider the bilinear system

    \dot{x}(t) = Ax(t) + u(t)Bx(t) + bu(t)

There are 3 more or less obvious remarks to be made about the existence
of feedback control laws which make x = 0 asymptotically stable.

(i) If (b, Ab, ..., A^{n-1}b) is of full rank then by virtue of the pole
relocation theorem for linear systems there exists a linear feedback control
u = -k'x such that A-bk' has all its eigenvalues in Re s < 0. Since k'xBx
is quadratic in x it follows that x=0 is (locally) asymptotically stable for

    \dot{x}(t) = Ax(t) - k'x(t)Bx(t) - bk'x(t)

(ii) If \dot{x}(t) = Ax(t) is asymptotically stable then the control law
u = 0 results in stability. If \dot{x}(t) = Ax(t) is stable but not asymptotically
stable then there exists a nonsingular symmetric matrix Q such that
QA+A'Q = 0 and the control

    u(t) = -x'(t)(QB+B'Q)x(t) - x'(t)Qb

gives asymptotic stability unless u vanishes identically for some non-
decaying solution of \dot{x}(t) = Ax(t).

(iii) If there exists a choice of basis such that

    A = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix};    B = \begin{bmatrix} B_{11} & B_{12} \\ 0 & B_{22} \end{bmatrix};    b = \begin{bmatrix} c \\ 0 \end{bmatrix}

and if

    \dot{x}(t) = A_{22}x(t) + u(t)B_{22}x(t)

meets the conditions for instability for the circle theorem [2] or some
other instability criterion then there is no stabilizing control law.

We now describe a refinement of (ii) which is well suited for the 
study of the electrical network problems we have been discussing. 

Theorem : Suppose that A is similar to a skew symmetric matrix and suppose
that the eigenvalues of A satisfy the condition \lambda_i + \lambda_j \ne \lambda_k
for all i, j and k. Assume that b, Ab, ..., A^{n-1}b are linearly independent.
Then there exists an n by n matrix Q with Q = Q' > 0 such that QA + A'Q = 0
and the control law

    u = -x'(QB+B'Q)x - x'Qb

makes the null solution of

    \dot{x}(t) = Ax(t) + u(t)Bx(t) + bu(t);    x(t) \in R^n

asymptotically stable in the large.

Proof: The existence of a Q satisfying the given condition is classical
(see e.g. [2]). We want to use the Liapunov function v = x'Qx and a theorem
of LaSalle which gives asymptotic stability if x'Qx is positive definite
and \dot{v} is not identically zero along a nonzero trajectory. In this case

    \dot{v} = -u^2

Now if \dot{v} is identically zero then u is zero and \dot{x} = Ax. We also have

    x'(QB+B'Q)x = -x'Qb

but if u vanishes then x(t) = e^{At}x_0. Since QA = -A'Q we see
x_0' e^{A't} Qb = x_0' Q e^{-At} b and, by the condition on b, Ab, ..., A^{n-1}b,
this does not vanish identically for x_0 \ne 0. Thus we must have

    x_0' e^{A't}(QB+B'Q)e^{At} x_0 = -x_0' e^{A't} Qb

with both sides nonzero. Now the right side is a sum of terms of the type
\alpha_i e^{\lambda_i t} and confluent forms where \lambda_i are the eigenvalues
of A. The left side is a similar sum of \beta_{ij} e^{(\lambda_i + \lambda_j)t}.
By hypothesis the exponentials on the right and left are distinct. Since the
right side does not vanish identically they are not equal.

It may be that the conclusion holds under the weaker condition that
there should exist no vector x \ne 0 such that Ax + x'(QB+B'Q)x(Bx+b) = 0.
This condition is certainly necessary and perhaps it is sufficient as well.

7. EXAMPLES

This section consists of an example illustrating the application 
of the analysis done in sections 4 and 6. We are especially interested 
in determining the effects of going to higher order approximations. 

We consider the network shown in figure 3b. Assuming a one volt supply
with positive polarity down, the equations of motion are

    \begin{bmatrix} L\dot{x} \\ C\dot{y} \end{bmatrix} = \begin{bmatrix} 0 & 1-u \\ u-1 & -R \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + u \begin{bmatrix} -1 \\ 0 \end{bmatrix}        (*)

Here x is the inductor current, y is the capacitor voltage, and u=1 when the
switch is closed on the left and 0 when it is closed on the right. If we
assume u is operated periodically then the first approximation is



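    \begin{bmatrix} L\dot{z}_1 \\ C\dot{z}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1-\bar{u} \\ \bar{u}-1 & -R \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} + \bar{u} \begin{bmatrix} -1 \\ 0 \end{bmatrix}        (**)

where \bar{u} is the average value of u. We can associate a time invariant
network with these equations in various ways. If we want to preserve the
meaning of z_2 as the voltage across the resistor then the network in
figure 7 is appropriate.

Figure 7 : Network equivalent of the averaged equation
(element labels: \bar{u}/(1-\bar{u}) and L/(1-\bar{u})^2).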
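
The quality of the first approximation can be examined by direct
simulation. The sketch below is illustrative and rests on assumptions that
are not in the text: L = C = R = 1, a duty cycle of 4/5, a pulse period of
0.02, zero initial conditions, and a fixed-step Runge-Kutta integrator
whose steps are aligned with the switching instants. It integrates the
switched equation (*) and the averaged equation (**) for 200 time units
and compares the final states with the averaged equilibrium (-20, 4).

```python
def deriv(state, u, R=1.0, L=1.0, C=1.0):
    """Right side of equation (*) for a given switch value u."""
    x, y = state
    return (((1.0 - u) * y - u) / L, ((u - 1.0) * x - R * y) / C)

def rk4_step(state, u, dt):
    """One classical Runge-Kutta step with u held constant."""
    def shifted(k, h):
        return (state[0] + h * k[0], state[1] + h * k[1])
    k1 = deriv(state, u)
    k2 = deriv(shifted(k1, dt / 2), u)
    k3 = deriv(shifted(k2, dt / 2), u)
    k4 = deriv(shifted(k3, dt), u)
    return (state[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            state[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def simulate(switched, on_steps=4, steps_per_period=5, dt=0.004,
             periods=10000):
    """Integrate from rest; u is either the pulse train or its average."""
    state = (0.0, 0.0)
    duty = on_steps / steps_per_period
    for k in range(periods * steps_per_period):
        if switched:
            u = 1.0 if k % steps_per_period < on_steps else 0.0
        else:
            u = duty
        state = rk4_step(state, u, dt)
    return state

switched_state = simulate(True)    # pulse-width modulated system
averaged_state = simulate(False)   # first approximation
print(switched_state, averaged_state)
```

With a pulse period this short compared with the circuit time constants
the two final states differ only by the switching ripple; lengthening the
period makes the discrepancy grow, which is the situation the refined
approximation is meant to address.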
If u is periodic and of the form shown in figure 8 then we can refine
this approximation by taking additional terms in the Baker-Campbell-Hausdorff 
formula. Introduce A and B, taken from equation (*) with L = C = 1, as

    A = \begin{bmatrix} 0 & 1 \\ -1 & -R \end{bmatrix};    B = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}

The next approximation is then

    \ln \frac{1}{2} \{ e^{\alpha(A+B)} e^{(1-\alpha)A} + e^{(1-\alpha)A} e^{\alpha(A+B)} \}

        \approx A + \alpha B + \frac{1}{12} \{ \alpha(1-\alpha)(1-2\alpha)[[B,A],A] + \alpha^2(1-\alpha)[[A,B],B] \}

        = A + \alpha B + \frac{1}{12} \{ \alpha(1-\alpha)(1-2\alpha) \begin{bmatrix} -2R & -R^2 \\ R^2 & 2R \end{bmatrix} + \alpha^2(1-\alpha) \begin{bmatrix} -2R & 0 \\ 0 & 2R \end{bmatrix} \}

Near a 50% duty cycle the second correction term is more important than the
first, which, in fact, vanishes at \alpha = 1/2. The second term has the effect
of decreasing the output resistor by \alpha^2(1-\alpha)R/12 and inserting this same
value of resistance in series with the inductor as shown in figure 9. The
actual percentage change in the output resistor is about 2% but this
together with the insertion of the small resistor in series with the
inductor has a notable effect on the frequency response characteristics.

Figure 8 : The duty cycle for switch in figure 3.
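
The double commutators entering the correction can be evaluated
mechanically. The sketch below is a check under stated assumptions: A and B
are read from equation (*) with L = C = 1, R is set to 1 for the numerical
evaluation, and exact rational arithmetic is used. It computes [[B,A],A]
and [[A,B],B] and the resulting correction term at a 50% duty cycle.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(X, Y):
    """Commutator product [X,Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

R = 1                      # assumed value, for illustration only
A = [[0, 1], [-1, -R]]
B = [[0, -1], [1, 0]]

first = bracket(bracket(B, A), A)    # [[B,A],A]
second = bracket(bracket(A, B), B)   # [[A,B],B]

def correction(alpha):
    """(1/12){alpha(1-alpha)(1-2alpha)[[B,A],A] + alpha^2(1-alpha)[[A,B],B]}."""
    c1 = alpha * (1 - alpha) * (1 - 2 * alpha) / 12
    c2 = alpha ** 2 * (1 - alpha) / 12
    return [[c1 * first[i][j] + c2 * second[i][j] for j in range(2)]
            for i in range(2)]

half = correction(Fraction(1, 2))
print(first, second, half)
```

At \alpha = 1/2 only the diagonal term survives; its negative (1,1) entry
acts like a resistance in series with the inductor and its positive (2,2)
entry reduces the magnitude of the -R entry.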
Figure 9 : A better approximation for the network
in figure 3.


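Based on the first approximation described by (**) let's look for a
control law which stabilizes the output voltage to a value z_2^* > 0. This
means that for steady state we must have

    \begin{bmatrix} 0 & 1-u_0 \\ u_0-1 & -R \end{bmatrix} \begin{bmatrix} z_1^* \\ z_2^* \end{bmatrix} + u_0 \begin{bmatrix} -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

This fixes the values of z_1^* and u_0:

    u_0 = \frac{z_2^*}{z_2^* + 1};    z_1^* = \frac{R z_2^*}{u_0 - 1} = -R z_2^* (z_2^* + 1)

Say that z_2^* = 4 and R = 1. Then u_0 = 4/5 and we want to stabilize at

    \begin{bmatrix} z_1^* \\ z_2^* \end{bmatrix} = \begin{bmatrix} -20 \\ 4 \end{bmatrix}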
Introduce z - z^* = y and \bar{u} - u_0 = v. In these coordinates we have

    \begin{bmatrix} L\dot{y}_1 \\ C\dot{y}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1/5 \\ -1/5 & -1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} + v \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} + v \begin{bmatrix} -5 \\ -20 \end{bmatrix}

If we now linearize this equation about y = 0, v = 0 we see that the
transfer function between the output voltage error




This linearization is obtained by ignoring the v y_1 and v y_2 terms. This
is the relevant transfer function for feedback regulator design, regarding
the average switch position as the input and the deviation of the output
voltage from its average value as the output. If necessary one can now
return to the refined approximation and work out a more accurate equivalent
model.
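
The operating point and the linearized model can be worked out
mechanically from (**). The sketch below is illustrative and rests on
assumptions of our own: L = C = 1, the deviation equation is obtained by
substituting z = z^* + y and \bar{u} = u_0 + v into (**), so that the input
vector is Bz^* + b, and the output is taken to be y_2. It returns the
numerator and denominator coefficients (descending powers of s) of the
transfer function from v to y_2.

```python
from fractions import Fraction

def operating_point(z2_star, R):
    """Steady state of (**): returns (u_0, z_1^*) for a desired z_2^*."""
    u0 = Fraction(z2_star) / (z2_star + 1)
    z1_star = -R * z2_star * (z2_star + 1)
    return u0, z1_star

def linearize(z2_star, R):
    """Matrices of the linearized deviation equation dy/dt = F y + g v."""
    u0, z1_star = operating_point(z2_star, R)
    F = [[0, 1 - u0], [u0 - 1, -R]]     # A + u_0 B
    g = [-z2_star - 1, z1_star]         # B z^* + b
    return F, g

def transfer_function(F, g):
    """Coefficients of c (sI - F)^{-1} g with c = [0, 1]."""
    num = [g[1], F[1][0] * g[0] - F[0][0] * g[1]]
    den = [1, -(F[0][0] + F[1][1]),
           F[0][0] * F[1][1] - F[0][1] * F[1][0]]
    return num, den

u0, z1_star = operating_point(4, 1)
F, g = linearize(4, 1)
num, den = transfer_function(F, g)
print(u0, z1_star, g, num, den)
```

Under these assumptions, with z_2^* = 4 and R = 1, the computation gives
(1 - 20s)/(s^2 + s + 1/25), i.e. a stable pair of poles together with a
right half plane zero, which is something a feedback design based on this
model would have to take into account.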
8. CONCLUSIONS

We have shown here that commutated electrical networks can be 
analyzed in an approximate way by using an averaging technique based on 
first order differential equation descriptions and the Baker-Campbell- 
Hausdorff formula from Lie theory. There are three steps in the 
analysis: 

(a) replace all unidirectional switches by bidirectional equivalents 
or bidirectional approximations valid in the operating regime, 

(b) introduce equivalent circuit equations based on averaging and
expansion of e^A e^B,

(c) stabilize the resulting bilinear equations using linearization 
or bilinear theory. 

An example is given to indicate the type of insight available from this 
approach.

IV. SUMMARY AND CONCLUSIONS


R.W. Brockett 

1. Main Results 

NASA support has resulted in the work leading to the publications 
[1-25] cited in the reference list. While a detailed summary of this work 
is obviously impossible there are several main lines of thought which are 
apparent. We list these with an indication of their origin. 

1. High efficiency power conversion networks are usually well approximated
by electrical networks with linear inductors, capacitors, resistors (for load),
sources, ideal switches and diodes, together with the control circuitry for
the switches. [1-5]

2. Switched electrical networks of this type are, without the control 
circuitry, bilinear systems. [1,3,4] 

3. Many aspects of the behavior of bilinear systems can be understood on 
the basis of mathematical models without resorting to simulation. The study 
of controlled bilinear systems such as arise with regulated DC to DC supplies 
in which linear or nonlinear feedback is applied to a bilinear system is 
more difficult but design for stability is possible based on mathematical 
models. [2,5] 

4. The use of Lie algebraic techniques is essential for understanding the 
controllability and observability of bilinear systems. Moreover, these same 
techniques carry over with little change to more general nonlinear systems. 

This last point is important in understanding the feedback control of bilinear
systems since even linear feedback leads to systems which are no longer bilinear.

5. The standard use of averaging to approximate the behavior of switching
regulators can be refined using Lie algebraic techniques. This refinement 
is useful when the natural frequencies of the regulator and the clock 
frequency are not widely separated. [5] 

Taken together the methods developed here constitute a basis for 
understanding some of the theoretical problems which arise in the study 
of power conversion networks. The idea that Lie algebras shed some light 
on the control of switchable electrical networks is felt to be one of the 
major contributions. A priori there was no hint in the literature that 
this might be true. Equally important is the idea that the basic tools 
of modern control theory, e.g. state space models, Liapunov stability 
methods, optimal control, etc., can be of practical value in designing 
control laws for converters and regulators. To be sure, this latter idea 
is becoming widely recognized, and our work only reinforces an established 
trend. However our mathematical methods can only go so far toward the 
solution of the design problems and further interactions with system 
designers should be useful in refining the methodology generated so far.

2. Difficulties

The principal remaining difficulties in analyzing nonlinear
equations of the type which occur in power processing problems lie in the
areas of:

1. Free running converters for which the clock speed is not a priori 
fixed but which depends on the load and supply conditions. 

2. Converters which face widely changing load conditions such as would 
cause the system to change its basic mode of operations. 

3. The design of multimode converters controlled by finite state systems 
of considerable complexity. 

For those applications in which reliability considerations outweigh 
cost, it seems likely that more sophisticated digital control circuitry 
will become more common. This will make item three very important in this 
context. It also seems likely that an increasing number of applications 
will be found where efficiency is the overriding consideration and for these 
cases sophisticated digital control circuitry may be justified also. 

Though it has been recognized for a long time that there is a real need 
for a theory of systems which are partly continuous and partly finite state, 
results have been slow in coming. It may be that previous efforts have 
addressed the problem in too much generality and have not exploited the 
special features of the known successful applications. In any case this 
problem seems too important to ignore in spite of the apparent difficulty.

3. Future Work

The main hope for further simplification in this area of nonlinear
analysis rests in finding a suitable extension of the transfer function
idea. Recent work on Volterra series* indicates that the Volterra kernels
for the input-output map for systems governed by ordinary differential
equations can be computed rather easily and can be of use in understanding
the behavior of systems. This idea has already been worked out in detail
by d'Alessandro, Isidori, and Ruberti** for bilinear systems and it seems
to hold great promise for future developments.

It is also clear that more work should be done which recognizes
explicitly the role of logic elements in the controller. This is a
difficult problem area but one of great importance.

Finally, in view of the great importance which one must place on
efficiency it seems that more emphasis should be placed on the development
of fundamental bounds on efficiency. We feel that the work of Wolaver†
on fundamental limitations on converter circuits is an excellent start
and that this line of work deserves more attention.


* R.W. Brockett, "Volterra Series and Geometric Control Theory," Proc.
International Federation on Automatic Control, Boston, Mass. 1975.

** P. d'Alessandro, A. Isidori, and A. Ruberti, "Realizations and Structure
Theory of Bilinear Dynamical Systems," SIAM J. on Control, (to appear).

† D. Wolaver, Fundamental Study of DC to DC Conversion, Ph.D. Thesis,
M.I.T., Dept. of Electrical Engineering, June 1969.